AN INDEX OF CONSUMPTION OF FUELS AND WATER
POWER¹


By F. G. Tryon, Institute of Economics, Washington, D. C. 


Anything as important in industrial life as power deserves more 
attention than it has yet received from economists. The industrial 
position of a nation may be gauged by its use of power. The great 
advance in material standards of life in the last century was made 
possible by an enormous increase in the consumption of energy, and the 
prospect of repeating the achievement in the next century turns 
perhaps more than on anything else on making energy cheaper and 
more abundant. A theory of production that will really explain how 
wealth is produced must analyze the contribution of this element of 
energy. 

These considerations have prompted the Institute of Economics to 
undertake a reconnaissance in the field of power as a factor of production. 
One of the first problems uncovered has been the need of a long-time 
index of power, comparable with the indices of employment, of the 
volume of production and trade, and of monetary phenomena, that will 
trace the growth of the factor of power in our national development. 
The problem presents many difficulties, and we are not sure that it can 
be solved, but a partial answer is given by constructing an index of the 
amount of the raw stuff of energy consumed, the coal, oil, natural gas, 
water power, and energy from minor sources that have been absorbed 
by the country. 

I am indebted to Director Moulton for permission to publish this 
first fruit of the Institute’s study. 

1 Presented at the eighty-eighth annual meeting of the American Statistical Association at St. Louis,
December 28, 1926. No attempt has been made to bring the statistics in the paper down to date in the
interval between its presentation and printing.


ELECTRICITY THE BEST SHORT-TIME INDICATOR OF POWER


The best short-time indicator of the factor of power is the business of 
the electric utilities. The monthly statistics of production of electricity, 
published by the Geological Survey,¹ have attracted several makers of
business barometers and are used by the New York Reserve Bank in a 
very useful adjusted index of electric power production. This produc- 
tion series is readily corrected for seasonal change, but it is affected by a 
disconcertingly rapid annual growth, which is less easily determined. 
The uncertainties of the growth factor are minimized by the new series 
which Davis is publishing in Electrical World and which utilizes not the 
production of electricity, but the consumption by groups of identical 
industrial plants.² Mr. Davis’ early results are most promising and we
look forward to the development of his electrical barometer with the 
keenest interest. 

When the growth trend is satisfactorily eliminated, as no doubt it 
will be by some device, electricity will furnish one of the most sensitive 
and useful indicators of the volume of production. It can be accurately 
measured and quickly reported; it reflects a great diversity of activities; 
and it is not affected by the changes in price level which impair many 
financial indicators, or by the changes in stocks which impair many 
indicators based on commodity production. 


REQUIREMENTS OF AN IDEAL INDEX OF POWER AND HEAT 


But as a long-time measure of power as a factor in our national life, 
electricity is not satisfactory because it does not go back far enough,³
and because it still covers only a part of the total amount of power 
generated. 

The extremely rapid increase in the output of the electric utilities— 
which averages about 10 per cent a year—does not give the net increase 
in the use of power because it includes the replacement of direct steam 
power. It does not even give the net progress of electrification, 
because many industries which formerly generated their own electricity 
are now purchasing current from central stations. 

1 United States Geological Survey, Division of Power Resources, A. H. Horton, Engineer in Charge.
Production of Electric Power by Public Utility Power Plants, mimeographed.
2 Davis, Robert M., “Electrical Energy Consumption as an Industrial Barometer,” paper read at the
same session.
3 Electricity production by public utilities was reported for 1902, 1907, 1912, and 1917 by the
quinquennial census of electrical industries, but the continuous annual series does not begin until 1919.

Moreover, in spite of its rapid development, electricity is still only
one among many forms of mechanical power. It has become the
dominant form in manufacturing, and perhaps in mining, but it still
supplies a very small part of the power used in construction, in
transportation (except for the street railways), and in agriculture. Now the
power revolution has made its way into these other activities, as well as 
into manufacturing. Transportation uses more horsepower-hours and 
vastly more power equipment than does manufacturing. In both 
mining and transportation the power used per worker exceeds that in 
manufacturing. Even in agriculture the power equipment per worker 
—counting work animals—is greater than in manufactures, though 
because farm equipment has a low use factor, the ratio of power-hours 
per worker is less. Of the total horsepower-hours developed in the 
United States, hardly a third is yet supplied by the electric utilities. 
This indicates a great expansion for electricity in the future, but the 
significant thing for our purpose is not merely “superpower” or 
“electrification,” important as they are, but rather the total consumption
of power in all forms, and the aggregate degree of replacement of
human labor by power machines. 

Ideally, I suppose, what we want for an index of mechanical power 
is the total number of horsepower-hours generated by all forms of power 
equipment, no matter where installed or by what prime movers they 
may be driven. The attainment of such an ideal index is, of course, a 
long way in the future, but in the meantime something can be done by 
measuring the horsepower of the installed equipment. C. R. Daugh- 
erty, of the University of Pennsylvania, has just finished a notable 
paper which traces the total installed horsepower in the United States 
in all branches of activity, agriculture and transportation as well as 
mining and manufactures, at each census year from the Civil War to the 
present. Knowing the horsepower equipment and something of the 
average use factor and the amount of fuel consumed, it should be possi- 
ble to make some estimate of the total number of horsepower-hours 
developed in the country that will at least show the trend for the last 
few decades. 

Another approach to the problem of an index of power is to measure 
the amount of fuel consumed, or the fuel equivalent of other sources of 
energy. This is done in the series here presented. 

It will be said at once that such an index of raw energy consumed takes 
no account of improvements in the efficiency of converting fuel into 
mechanical power. That is very true, and it will shortly be pointed 
out how rapid improvements in combustion have affected the curve at 
certain points. But it is to be noted that improvements in fuel 
efficiency have been going on ever since Watt first took hold of the steam 
engine, and that the effects of a given improvement are spread over a 
period of years, so that the change from one year to the next is small. 
Its effect in the aggregate consumption of all energy is to alter the rate
of growth, not to cause an absolute reduction in the amount of raw
energy consumed. 

It will also be said of such an index that it includes much fuel used 
directly for heat and not converted into mechanical power. The 
direct applications of heat, however, are no less important in the process 
of wealth production than is mechanical power itself. Heat and 
motion are twin forms of the same thing—energy—and either may be 
converted into the other. Their relation is illustrated by electricity 
which is generated from heat and then consumed at will, either to 
produce motion, or light (and heat), or heat alone, and industrial 
heating is already an important part of the business of the electric 
utilities. The contribution of direct heat is as indispensable as of any 
other form of energy. The essential thing is the country’s command of 
energy, and at present about half the annual energy budget—equivalent 
to 480,000,000 tons of coal—is consumed as heat. We commonly 
think of heat as used for warming buildings, but a greater amount goes 
for industrial purposes. The most ravenous consumer of energy is the 
iron and steel industry, and it requires many times as much energy in 
the form of heat as in the form of motion. The uses of motive power 
and direct heat necessarily expand together, for we cannot harness 
motion without the metals and we cannot use the metals without heat. 
The Industrial Revolution was no less a period of sudden advance in the 
art of applying heat than in the art of applying mechanical motion. 
We must think of the age of power, in a larger sense, as the age of 
energy. 

An ideal index of power as a factor in production should therefore 
include heat as well as mechanical power. The two are interchange- 
able, derived from the same sources, and to an increasing degree 
supplied by the same agencies. Yet they are sufficiently distinct so 
that the ideal index should consist of two series, one representing direct 
heat, the other mechanical motion, the two being finally combined into 
a composite index of energy. 

The present index is a step, but only a preliminary step, in the 
solution of this problem. It includes the raw energy consumed for all 
purposes, heat and light as well as mechanical power. 


CONSTITUTION OF THE ENERGY MATERIALS INDEX 


The heavy black line in the first chart is an index of the quantity 
of energy materials consumed in the United States in each year from 
1870 to 1926. The base is the year 1899. The index includes the min- 
eral fuels consumed, and the coal equivalent of the power developed by 
water wheels, by work animals, and by windmills. It includes the
charcoal used in blast furnaces, but not other forms of wood fuel.
All of these fuels or fuel equivalents have been reduced to a common 
denominator in British thermal units. 
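
The reduction to a common denominator may be sketched as follows. This is only an illustration of the arithmetic: the heat contents and the quantities are round placeholder figures assumed for the example, not the factors or data used in the index, and the function names are hypothetical.

```python
# Hypothetical heat contents in British thermal units per unit of each
# material. These round figures are placeholders for illustration only.
BTU_PER_UNIT = {
    "bituminous_coal_ton": 26_000_000,
    "anthracite_ton": 27_000_000,
    "petroleum_barrel": 6_000_000,
    "natural_gas_thousand_cu_ft": 1_000_000,
}


def total_btu(quantities):
    """Sum the heat content of one year's consumption, in Btu."""
    return sum(BTU_PER_UNIT[name] * amount for name, amount in quantities.items())


def energy_index(consumption_by_year, base_year=1899):
    """Express each year's total Btu as a percentage of the base year."""
    base = total_btu(consumption_by_year[base_year])
    return {
        year: 100.0 * total_btu(quantities) / base
        for year, quantities in consumption_by_year.items()
    }


# Made-up quantities, chosen so that every item doubles between the two years.
example = {
    1899: {"bituminous_coal_ton": 100, "petroleum_barrel": 50},
    1900: {"bituminous_coal_ton": 200, "petroleum_barrel": 100},
}
index = energy_index(example)
print(index)
```

Because every material is first converted to heat units, a ton of coal and a barrel of oil enter the total in proportion to the energy they carry rather than to their weight or volume.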

The figures for anthracite and bituminous coal are the domestic 
production, plus the imports minus the exports. Beginning with the 
war they have been corrected for changes in stocks. In recent years 
the production of coal as distinct from the consumption has been 
a poor business indicator, because of the effect of strikes and the ebb 
and flow of coal in and out of storage.¹ In 1918, 28 million tons of
bituminous coal were added to consumers’ stocks, and the following 
year an even greater quantity was withdrawn again, so that although 
the production of 1918 exceeded 1919 by 113 million tons, the consump- 
tion of 1918 was greater by only 49 million tons. With the inauguration 
of periodic surveys of consumers’ stocks of coal by the Federal Govern- 
ment it is now possible to determine the actual consumption with 
reasonable accuracy. This correction cannot be made prior to 1916, 
but there is reason to believe that in the earlier years the carry-over in 
stocks from one year to the next was fairly constant. The strikes of 
those days came with great regularity on April 1st of the even years.
The stocks accumulated were smaller, and the deficit in production was 
usually made up within the same year in which the strike occurred. It 
is clear, however, that the great anthracite strike in 1902 and perhaps 
the bituminous strike of 1906 did distort the curve for those years and 
the years immediately following, and at this date no adequate basis of 
correction exists. 
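
The stock correction for 1918 and 1919 can be checked with a little arithmetic. The sketch below assumes that foreign trade was the same in both years, so that the gap in consumption differs from the gap in production only by the stock movements. The withdrawal it yields for 1919 is implied by the three figures quoted above and is not itself a figure from the stock surveys. The function names and the production levels used in the check are hypothetical.

```python
def apparent_consumption(production, imports=0.0, exports=0.0, stock_change=0.0):
    """Consumption = production + imports - exports - additions to stocks.

    stock_change is positive when coal is added to storage and negative
    when it is withdrawn.
    """
    return production + imports - exports - stock_change


def implied_second_year_stock_change(production_gap, consumption_gap,
                                     first_year_stock_change):
    """Stock change in the second year implied by the two gaps.

    With trade assumed equal in both years:
        consumption_gap = production_gap - s1 + s2
    so  s2 = consumption_gap - production_gap + s1.
    """
    return consumption_gap - production_gap + first_year_stock_change


# Figures from the text, in millions of tons: 1918 production exceeded 1919
# by 113, 1918 consumption exceeded 1919 by 49, and 28 were stocked in 1918.
stock_change_1919 = implied_second_year_stock_change(113, 49, 28)
print(stock_change_1919)  # negative: a withdrawal from stocks

# Cross-check with hypothetical production levels that differ by 113.
gap = (apparent_consumption(513, stock_change=28)
       - apparent_consumption(400, stock_change=stock_change_1919))
print(gap)
```

The implied withdrawal of 36 million tons in 1919 agrees with the statement that a quantity even greater than the 28 million tons stocked in 1918 was taken out of storage the following year.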

The figures for petroleum, up to 1913, are the apparent consumption, 
calculated by adding to the domestic production the imports of crude 
and the imports of all refined products and deducting the exports of 
crude and of all refined products including paraffin (reduced to its 
equivalent in crude), and deducting also shipments to Alaska, Hawaii, 
and Porto Rico. Beginning with 1913 the petroleum figures are cor- 
rected for changes in pipe-line and tank-farm stocks of crude and be- 
ginning with 1917, for changes in refiners’ stocks of both crude and re- 
fined products. No correction for stocks of petroleum products in the 
hands of jobbers or ultimate consumers can be made. 

The figures for natural gas are the sales as reported by the Geo- 
logical Survey, and for the early years of the industry, when no record 
of sales was kept by many producers, they are based on the estimates 
of contemporary observers of the amount of coal displaced by gas. 

1 David L. Wing and F. G. Tryon, “Fluctuations in Coal Production,” in The Problem of Business
Forecasting, p. 202.

Waterpower is represented in the index by the equivalent in coal
necessary to generate the same power. The actual production of
waterpower in horsepower-hours has been collected by the Geological 
Survey from the public utilities since 1919. For earlier years and for 
the industrial plants generating their own waterpower, the horsepower- 
hours have been estimated from the installed capacity of water wheels 
as reported by the census, data for which are available back to 1879. 
The water horsepower-hours thus obtained have been converted into 
coal, and since waterpower is still a small factor in the total energy 
supply, a unit fuel consumption somewhat greater than that of the 
electric plants generating fuel power was assumed in order not to 
understate its importance. 

The figures of wind-power on farms are derived from C. D. Kins- 
man’s notable study, An Appraisal of Power Used on the Farms of the 
United States.¹

The figures of animal power are of course the roughest of approxi- 
mations, but it is necessary to include them because they represent 
power quite as much as locomotives or tractors, and because if they 
are not included, a false idea of the rate of growth is introduced by the 
substitution of automotive for animal power. Fortunately a con- 
siderable basis of estimate was at hand. Kinsman’s study, based on 
much field observation, had estimated the horsepower-hours of animal 
power on farms in 1923. The census gave the number of work ani- 
mals on farms and not on farms for the census years, and the De- 
partment of Agriculture’s annual estimates gave the number on farms 
for the inter-census years. From these the probable number of horse- 
power-hours developed has been estimated, using the factors of 
strength per animal and hours worked per year accepted by the De- 
partment of Agriculture in Kinsman’s study. The horsepower-hours 
of animal power, thus developed, have been converted into equivalent 
fuel, assuming a very low thermal efficiency. 
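
The conversion of horsepower-hours into equivalent fuel, used for waterpower, wind, and animal power alike, may be sketched as below. The efficiencies and the heat content of coal are assumptions chosen for illustration, since the factors actually adopted are not stated here; only the heat equivalent of the horsepower-hour, about 2,545 British thermal units, is a physical constant.

```python
BTU_PER_HORSEPOWER_HOUR = 2545.0  # approximate heat equivalent of one horsepower-hour


def fuel_equivalent_btu(horsepower_hours, thermal_efficiency):
    """Heat in the fuel needed to deliver the given work at the given efficiency."""
    if not 0.0 < thermal_efficiency <= 1.0:
        raise ValueError("thermal_efficiency must be in (0, 1]")
    return horsepower_hours * BTU_PER_HORSEPOWER_HOUR / thermal_efficiency


def coal_equivalent_tons(horsepower_hours, thermal_efficiency, btu_per_ton):
    """Tons of coal carrying the same heat as the fuel equivalent."""
    return fuel_equivalent_btu(horsepower_hours, thermal_efficiency) / btu_per_ton


# Hypothetical inputs: one million horsepower-hours, coal at 26 million Btu
# per ton, and assumed efficiencies of 10 per cent and 5 per cent.
water_tons = coal_equivalent_tons(1_000_000, 0.10, 26_000_000)
animal_tons = coal_equivalent_tons(1_000_000, 0.05, 26_000_000)
print(round(water_tons), round(animal_tons))
```

The lower the efficiency assumed, the larger the fuel equivalent credited to a given amount of work, which is why a generous unit fuel consumption guards against understating the importance of a source.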

In defense of these methods of estimate it should be said that they 
apply to but a small proportion of the total energy consumption, at 
least in the later years. In 1926, for example, waterpower contributed 
only 6.2 per cent, work animals 2.8 per cent, and windmills less than 
0.1 per cent of the grand total. A large error here would have little in- 
fluence on the final result. In the earlier years the possibility of error is 
more serious. 

1 United States Department of Agriculture, Bulletin 1348.


COURSE OF THE INDEX, 1870-1926


CHART I

UNADJUSTED INDEX OF ENERGY CONSUMPTION COMPARED WITH SNYDER’S
UNWEIGHTED INDEX OF PRODUCTION, 1870-1926. BASE 1899=100.0

[Chart not reproduced. Curves shown: energy consumption; index of production, unweighted.]


In Chart I the index, uncorrected for annual growth, is plotted on
logarithmic scale alongside an unweighted index of production worked
out by Dr. Carl Snyder. Snyder’s is the only general index of the
physical volume of production which has been carried back to 1870.
It includes a varying number of items, beginning with 49 in 1870 and 
increasing to 86 in 1910. The items are unweighted. As originally 
presented by Mr. Snyder before the American Economic Association 
in 1920,¹ and as it will appear in his forthcoming book, Business Cycles
and Business Measurements, from which he has kindly permitted me to
make use of it, the index was computed on a base of 1910-14 equals 100,
which is here re-computed to base 1899 equals 100. The parallelism of
the fluctuations in the two curves is fairly clear, even in the period
before 1890, and this is the more remarkable because Snyder’s index
includes a number of agricultural products, which constitute a relatively
large part of the 49 items used in the beginning of his series. I
might add that when the energy index is plotted against Stewart’s
weighted index of the volume of manufactures for the period 1890 to
1899¹ and against Day’s weighted indexes of manufactures and mining,
as published in the Harvard Review of Economic Statistics, it shows
the same harmony.

1 American Economic Review, March, 1921, pp. 70-74.
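
The re-computation of an index to a new base is a simple proportion, sketched below. The index values are invented for the example and are not Snyder’s figures; the function name is hypothetical.

```python
def rebase(index_by_year, new_base_year):
    """Rescale an index so that the chosen year equals 100."""
    base = index_by_year[new_base_year]
    return {year: 100.0 * value / base for year, value in index_by_year.items()}


# Invented values on some earlier base; after rebasing, 1899 equals 100.
original = {1890: 40.0, 1899: 50.0, 1912: 100.0}
rebased = rebase(original, 1899)
print(rebased)
```

Rebasing changes the level of the curve but not its shape: on a logarithmic scale the whole line is merely shifted up or down, so the comparison of fluctuations is unaffected.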

It will be seen that the rate of growth of the energy index is faster 
than that of the production index, and of course much faster than the 
growth of population. Whereas the physical volume of production has 
been found to increase at the rate of something like 4 per cent a year, 
the consumption of energy over much of the period shown was com- 
pounding at the rate of from 5 to 7 per cent a year. 
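
What these compound rates mean is perhaps clearest when expressed as the time required to double. The sketch below computes the doubling time for the three rates mentioned; the function name is hypothetical.

```python
import math


def doubling_time_years(annual_rate_pct):
    """Years needed to double at a constant compound annual rate."""
    return math.log(2.0) / math.log(1.0 + annual_rate_pct / 100.0)


for rate in (4, 5, 7):
    print(rate, round(doubling_time_years(rate), 1))
```

At 4 per cent a year a quantity doubles in something under eighteen years; at 5 to 7 per cent it doubles in roughly ten to fourteen.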

It will also be seen that the rate of growth in energy consumption 
has altered its course during the period, as indicated by the changing 
slope of the curve. These changes are probably more apparent than 
real. The particularly steep slope in the period from 1870 to 1890 is 
partly due to the fact that the index does not include the energy of fire- 
wood, a relatively important source of heat and even power in the early 
days. At present firewood contributes only 6 per cent as much energy 
as the other materials, and the data on its use in the early years are too 
fragmentary to permit its inclusion in the index, except for the char- 
coal used in smelting iron. Enough is known, however, to suggest 
that were it included, the lower end of the curve would be raised and 
the early slope made less steep. The decline in relative importance of 
firewood was probably most rapid between 1870 and 1890. 

Again, beginning with the World War, a pronounced flattening of 
the trend is apparent. The high prices of fuel which began in 1916 
and the actual shortages of the war itself stimulated interest in fuel 
economy and greatly accelerated the tendency to get more work out of 
the same quantity of coal, which had been present, though in a less de- 
gree, from the beginning. The change may be dated from 1917; and 
the trend from that year to 1926, computed by the method of least 
squares, has been at the average rate of increase of 2 per cent a year. 
It is not expected that progress in fuel economy can continue indefi- 
nitely at its present rapid rate, and in time the growth of energy con- 
sumption may be expected to resume a course more nearly parallel with 
that before the war. 

1 American Economic Review, March, 1921, pp. 66, 68.


RELATIVE GROWTH OF ENERGY AND OTHER INDICES


It is interesting to compare the growth of energy consumption with 
the growth of other measures of economic activity. Let us take the 
period from 1899 to 1916, a period when fuel economy was progressing 
at a moderate but relatively constant rate. In that period the index 
of energy shows an increase of 150 per cent. 

In the same period: 


Population ............................................ Increased 36 per cent
Physical volume of agricultural production
    (Average 1914-18, Stewart) ........................ [figure illegible]
Physical volume of manufactures (Stewart) ............. [figure illegible]
Physical volume of mining (Stewart) ................... [figure illegible]
Railroad transportation (Stewart) ..................... [figure illegible]
Stewart’s combined index of all production (agriculture,
    manufactures, mining, and transportation) ......... [figure illegible]


The increase in energy consumption was thus four times as great as 
the increase in population, and nearly twice as great as the increase in 
the total volume of production. It was materially greater than the
increase in manufactures. It was somewhat less than the increase in 
mining, but it must be remembered that the largest elements in min- 
ing are the energy materials coal and oil, while the energy index itself 
is held down by the inclusion of the vanishing horse. Finally the in- 
crease in energy was somewhat less than that in transportation. 

The broad relationships are unmistakable. A great increase in per 
capita production is made possible by a still greater increase in power, 
and along with the process goes an increase in transportation which is 
the greatest of all consumers of power. 
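
The increases for the seventeen years from 1899 to 1916 may be turned into equivalent annual rates, as in the sketch below. It uses only the two figures given above, 150 per cent for energy and 36 per cent for population, and assumes steady compound growth over the period; the function name is hypothetical.

```python
def annual_rate_pct(total_increase_pct, years):
    """Constant compound annual rate that yields the given total increase."""
    growth_factor = 1.0 + total_increase_pct / 100.0
    return (growth_factor ** (1.0 / years) - 1.0) * 100.0


years_elapsed = 1916 - 1899
energy_rate = annual_rate_pct(150, years_elapsed)
population_rate = annual_rate_pct(36, years_elapsed)
ratio_of_increases = 150 / 36

print(round(energy_rate, 1), round(population_rate, 1), round(ratio_of_increases, 1))
```

The energy rate of about 5.5 per cent a year falls within the range of 5 to 7 per cent cited earlier, and the ratio of the two total increases is a little over four.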


DEVIATIONS FROM THE TREND 


Can this index of energy be used as a measure of economic activity? 
It appears to have small value as a forecaster, but when corrected for 
the growth trend it ought to be a faithful measure of the extent of a 
boom or a depression. Chart II is an attempt to make such a correc- 
tion for the period 1899 to 1926. The change in direction already re- 
ferred to, which began in 1917 with the increasing attention to fuel 
economy, seems to warrant computing one trend for the years 1899 to 
1916 and another for 1917 to 1926. Both have been computed by the 
method of least squares. It must be emphasized that later experience 
may make advisable a different trend for the period since 1917. 
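
The adjustment may be sketched as follows. The text states only that the trends were computed by least squares; fitting a straight line to the logarithms of the index, which corresponds to a constant percentage rate of growth, is an assumption made here. The series is synthetic, and the function names are hypothetical.

```python
import math


def fit_log_trend(years, values):
    """Least-squares line through (year, ln value); returns (intercept, slope)."""
    n = len(years)
    logs = [math.log(v) for v in values]
    mean_year = sum(years) / n
    mean_log = sum(logs) / n
    sxx = sum((y - mean_year) ** 2 for y in years)
    sxy = sum((y - mean_year) * (g - mean_log) for y, g in zip(years, logs))
    slope = sxy / sxx
    return mean_log - slope * mean_year, slope


def adjusted_index(years, values, break_year):
    """Actual value as a percentage of trend, with separate trends
    fitted before break_year and from break_year onward."""
    adjusted = {}
    early = [(y, v) for y, v in zip(years, values) if y < break_year]
    late = [(y, v) for y, v in zip(years, values) if y >= break_year]
    for segment in (early, late):
        seg_years = [y for y, _ in segment]
        seg_values = [v for _, v in segment]
        intercept, slope = fit_log_trend(seg_years, seg_values)
        for y, v in segment:
            adjusted[y] = 100.0 * v / math.exp(intercept + slope * y)
    return adjusted


# Synthetic series: 5 per cent growth to 1916, 2 per cent afterwards.
years = list(range(1899, 1927))
values = []
for y in years:
    if y <= 1916:
        values.append(100.0 * 1.05 ** (y - 1899))
    else:
        values.append(100.0 * 1.05 ** 17 * 1.02 ** (y - 1916))

adjusted = adjusted_index(years, values, break_year=1917)
print(round(adjusted[1899], 3), round(adjusted[1926], 3))
```

A series that follows its two trends exactly gives an adjusted index of 100 throughout; with real data the departures above and below 100 measure the boom or depression.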

The chart compares the energy index thus adjusted, shown by the
solid black line, with Day’s adjusted indices of the physical volume of
manufactures and the physical volume of mining. Allowing for the
exaggerated vertical scale the fluctuations in the three curves show a
decided resemblance.


CHART II

ADJUSTED INDEX OF ENERGY CONSUMPTION COMPARED WITH DAY’S ADJUSTED
INDICES OF THE PHYSICAL VOLUME OF MANUFACTURES AND OF MINING,
1899-1926

Close correspondence was to be expected between the curves of en- 
ergy (shown by the solid line) and of mineral production (heavy broken 
line), because both are heavily weighted by the mineral fuels. It will 
be seen that they do in fact travel in the same direction almost with- 
out a break. 

The correspondence between the energy curve and the curve of 
physical volume of manufactures (shown by the dotted line) is less 
close. A discordance is at once seen in the year 1902, when the energy 
index moves downward, while the index of manufactures rises sharply. 
This was the year of the great anthracite strike, settled by President
Roosevelt, and there was evidently both a diminution of fuel consumption
and a depletion of stocks for which adequate correction cannot
now be made, though the curve as plotted is adjusted slightly to allow 
for some change in the quantity of anthracite in the storage yards of 
the producers.¹ In the same way the rise in the energy curve in 1903,
when the curve of manufactures is falling, probably represents heavier
production to rebuild the depleted stocks. 

A bituminous strike in 1906 and forced activity to rebuild stocks in 
1907 may be part of the reason for another discordance in 1906-7. 
No correction for this strike is attempted. 

But even discounting those two periods, there is evidence that the 
adjusted energy index lags a little behind the adjusted index of manu- 
factures, as for instance in 1910, in 1913, in 1915, in 1917, and in 1922. 
In these years a change registered in the manufactures curve appears 
one year later in the energy curve. That this relation is a lag, and not 
a contradiction, is shown by the fact that when the unadjusted figures
for the two series are compared, the contradiction disappears and the 
direction of movement coincides. The absolute changes are in the 
same direction; it is the degree of change that varies, and only when 
the two series are adjusted for trend does the lag become prominent. 
Why there should be a lag, I am not prepared to say, but the data 
suggest that the full response to a given boom or depression is 
registered somewhat later in the total consumption of energy materials 
than it is in the volume of manufactures as measured by Professor 
Day. 
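
One way of testing for such a lag, offered only as a sketch and not as the method used in this paper, is to correlate the adjusted manufactures index of each year with the adjusted energy index of the same year and of the year following. The two series below are invented so that the second repeats the first one year late; the function names are hypothetical.

```python
def pearson(xs, ys):
    """Ordinary product-moment correlation of two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def lagged_correlation(leader, follower, lag):
    """Correlate leader[t] with follower[t + lag], for lag >= 0."""
    if lag == 0:
        return pearson(leader, follower)
    return pearson(leader[:-lag], follower[lag:])


# Invented adjusted indices: the follower repeats the leader one year late.
manufactures = [100.0, 105.0, 98.0, 110.0, 95.0, 103.0, 99.0]
energy = [100.0] + manufactures[:-1]

same_year = lagged_correlation(manufactures, energy, 0)
one_year_later = lagged_correlation(manufactures, energy, 1)
print(round(same_year, 2), round(one_year_later, 2))
```

Where a lag of the kind described is present, the correlation at one year’s displacement will exceed that for the same year.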

The highest point reached by the adjusted energy index since the 
war was 107 in 1923, and the lowest was 85 in 1921. The range be- 
tween high and low was thus 22 points. For the adjusted index of 
manufactures the range was 37 points. The comparison indicates that 
the real swing between boom and depression is less than is often sup- 
posed, and is less than would be indicated by manufactures and mining 
alone. 


THE ENERGY INDEX FOR 1926 


A preliminary calculation of the energy data for 1926 shows an in- 
dex of 310 on a base of 1899, as against 288 in 1925. The increase in 
energy consumption in 1926 over 1925 has thus amounted to 8 per cent. 
The adjusted index for 1926 would be about 109, or much the same as 
in 1923. 
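
The percentage is easily verified from the two index numbers. The function name below is hypothetical.

```python
def percent_change(new, old):
    """Percentage change from old to new."""
    return 100.0 * (new - old) / old


change_1926 = percent_change(310, 288)
print(round(change_1926, 1))
```

The exact figure is a little under 8 per cent, which rounds to the 8 per cent stated.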

1 The adjustment amounts to 1.1 in the index for 1902, and 1.1 in the index for 1903. Similar
adjustments were introduced on account of the anthracite strikes of 1900 and 1912 as follows: For 1900
and 1901, 0.4; for 1911 and 1912, 1.1.


TENTATIVE INDEX OF ENERGY CONSUMPTION

Columns: (1) Year; (2) Energy index, unadjusted, 1899=100.0; (3) Harvard index of volume of
manufactures, unadjusted; (4) Energy index, adjusted (trend=100.0); (5) Harvard index of volume
of manufactures, adjusted; (6) Harvard index of volume of mining, adjusted.

(1)             (2)    (3)    (4)    (5)    (6)
1899            100    100     98     98    101
[illegible]     106    101     96     93     97

[The remaining rows are only partly legible: the year labels and some columns are lost. The
surviving pairs of entries, in order, are 112, 97; 122, 96; 124, 103; 122, 7; 143, 102; 152, 102;
151, 112; 126, 95; 155, 100.]


DIFFERENTIAL EQUATIONS SUBJECT TO ERROR,
AND POPULATION ESTIMATES¹


By Harold Hotelling, Food Research Institute, Stanford University


1. Introduction. The objects of the present paper are: (1) to ad- 
vance certain general considerations affecting the use of differential 
equations in statistics; (2) to apply these considerations in the fitting 
of “logistic” curves, which occur in various economic, sociological,
biological and chemical problems; (3) to show how to find the most 
probable value at any time, either by interpolation or by extrapolation, 
of a variable such as the population of the United States which may 
be assumed to have a tendency to proceed according to a differential 
equation, though subject to perturbations; and (4) to determine prob- 
able errors for such estimates. 

Relations between magnitude and rate of change having for example 
the form 


$$\frac{dx}{dt} = f(x)$$

or the more general form

$$\frac{dx}{dt} = f(x, t)$$


occur frequently in the physical sciences and are now becoming im- 
portant also in statistics. Of these forms are the differential equations 
underlying the extensive work of Pearl and Reed on populations of 
men and fruit flies, and investigations of organic growth by numerous 
biologists. The business cycle may be studied by means of such equa- 
tions. Indeed the use of differential equations supplies the statistician 
with a powerful tool, replacing the purely empirical fitting of arbitrary 
curves by a reasonable resultant of general considerations with par- 
ticular data. 

1 Presented before the American Mathematical Society, San Francisco Section, April 2, 1927.

But this growing statistical use of differential equations must
inevitably face the fact that our a priori knowledge can never supply
us with a definite relation between a variable and its rate of change, but
only with a correlation. In astronomy, physics and chemistry nearly
all correlations obtained in good work are very close either to zero or
to perfection. Consequently by discarding the first set and identifying
the other with complete causation it has been possible and natural
to avoid consideration of correlated but not rigidly connected variables. 
In this way it has come about that, for those fields in which large 
random fluctuations and merely correlated variables must be dealt 
with, the existing theory of differential equations is inadequate and 
needs to be supplemented by a new theory involving probability. Not 
only will single differential equations of the forms given above have to 
be dealt with, but more general systems of ordinary and partial 
differential equations are bound to appear. For example, there is 
reason to expect the diffusion of a species to proceed conformably to 
the equation for conduction of heat in a non-homogeneous medium, 
with density of population substituted for temperature. Considera- 
tion of the history of human migration, particularly with a glance at 
the beautifully colored maps showing density of population at each 
census of the United States published in the Statistical Atlas of the
twelfth census, strongly suggests that something like the flow of heat 
has occurred. Systems of ordinary and of partial differential equa- 
tions occur in economics. 

Much of the present paper is devoted, not directly to establishing 
such a general theory, but to the study in detail of certain problems 
associated with a particular case which has already received much 
attention. The population of the United States has been fitted by 
Pearl and Reed with the “logistic” equation 


$$p = \frac{L}{1 + m e^{-kLt}},$$

which is an integral of

$$\frac{1}{p}\,\frac{dp}{dt} = k(L - p),$$

this differential equation having a certain plausibility a priori. We 
shall inquire into the significance of an apparently close agreement 
between hypothesis and data. Next, on the assumption that there is 
an underlying tendency to follow the differential equation but that 
extraneous forces also affect the result, methods are developed for 
estimating population both in interpolation and in extrapolation. 
Finally, probable errors of these estimates are determined. These 
numerical results should be of value in statistical investigations re- 
quiring population estimates for other than census years. However, 
the main purpose is to develop a technique which may be paralleled in 
dealing with many other problems. Certain of the methods here set
forth are in fact being used by the author’s colleague, Dr. H. L. van de
Sande Bakhuyzen, in biological investigations.¹
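
That a logistic curve is an integral of a differential equation of this kind may be verified directly. The notation in the sketch below is an assumption adopted for the check: $p$ is the population, $L$ its limiting value, and $m$ and $k$ are positive constants, with the per capita rate of increase taken as proportional to $L - p$.

```latex
\begin{align*}
p &= \frac{L}{1 + m e^{-kLt}}, \\
\frac{dp}{dt} &= \frac{k L^{2} m e^{-kLt}}{\left(1 + m e^{-kLt}\right)^{2}}, \\
p\,(L - p) &= \frac{L}{1 + m e^{-kLt}} \cdot \frac{L\, m e^{-kLt}}{1 + m e^{-kLt}}
            = \frac{L^{2} m e^{-kLt}}{\left(1 + m e^{-kLt}\right)^{2}}, \\
\text{hence}\quad \frac{1}{p}\,\frac{dp}{dt} &= k\,(L - p).
\end{align*}
```

The constant $m$ does not appear in the differential equation; it enters only on integration, being fixed by the population at the origin of time.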

Apart from population and the logistic are certain general questions 
which require attention. The first of these is: Should the fitting process 
and tests of agreement with data be applied to the integrated or to the 
differential equation? 

2. Differential equations seem to have been used in statistical inves- 
tigations only for the purpose of obtaining finite equations, which are 
then fitted to the data without reference to the differential equations 
from which they originated. This procedure, though valuable on account
of its simplicity for preliminary investigations and work with
inaccurate data, has the following shortcomings. 

(a) It is a commonplace that the more numerous the disposable 
constants the better will be the fit, whether or not a genuine causal 
relation exists. The process of integration introduces at least one 
additional constant. Moreover, the disposable constants usually appear in 
a more complicated fashion in the integrated than in the differential 
equation, so that application of tests of goodness of fit capable of 
interpretation in terms of probability becomes difficult or impossible. 

(b) Any criterion used in fitting the integrated equation must con- 
tain a large arbitrary element and may give decidedly misleading re- 
sults. The method of least squares, for example, presupposes that 
the differences between observed values and ordinates of the curve 
are entirely of the nature of errors of observation and vary according 
to the normal law. Actually they may be nothing of the sort. The 
observations may be perfectly accurate, but the forces acting, instead 
of being given exactly by the differential equation, may have other 
components. The usual fitting processes, when applied to accurate 
observations, replace truth by fiction. 

(c) Since a curve fitted to accurate observations gives incorrect 
values for the variable when this has been observed, we can scarcely 
expect it to give correct values for other occasions. To use for inter- 
polation a function giving wrong values at the ends of the range would 
be ridiculous. Similarly, the use of a formula which has always been 
wrong in the past will not give the most probable values for the future. 

These three objections are logically valid even in the presence of 
errors of observation, though on account of the diminished reliability 
of all results, the harm done is then less. Save in the extreme hypo- 
thetical case of all deviations from the fitted curve being due to errors of 
observation rather than to actual disturbances of the variable, 
interpolation and extrapolation will be faulty, the method of fitting 
arbitrary, and valid measurement of accord of hypothesis with observation 
difficult. Exactly the same objections hold good if our measurements 
are not directly upon the variable whose causal relations are given by 
the differential system in question but upon some associated variable 
which, apart from its correlation with the variable in which we are 
chiefly interested, executes random movements of its own. These 
fluctuations are indeed a species of error of observation. 

1 “Growth and Growth Formulas in Plants,” Science, Vol. 64 (1926) pp. 653-654. See also Bakhuyzen and Alsberg, “The Growth Curve in Annual Plants,” Physiological Reviews, Vol. 7 (1927) pp. 151-187. 

While the customary method presents us with smooth, attractive 
curves to describe what has happened in the past and under known 
conditions, there must always be considerable hesitation about pro- 
longing these curves into the future and the unknown. We have in- 
deed, like Patrick Henry, no lamp to light our footsteps in the future 
save the past; but it does not follow that our future path is to be found 
as an analytic prolongation of some curve drawn among our old foot- 
prints. Rather do we require an analysis of causes, a study of the 
tendencies manifested repeatedly in the past upon the repeated occur- 
rence of conditions which we term essential, and in spite of the varia- 
tion of other conditions which we consider non-essential. Such an 
analysis is not provided by the mere determination of a curve of some 
assumed type which conforms to the general course of past events. 

3. As an alternative to the exclusive treatment of the integrated 
equation, the differential equation may be used more directly. We 
may inquire whether in the past the rate of change of the variable has 
been what the hypothesis asserts. By the same process we determine 
the constants in the differential equation, leaving the selection of con- 
stants of integration to a later stage. This comparison of actual with 
hypothetical rates of change will also, in many cases, suggest desirable 
modifications of the hypothesis. 

Let us distinguish this for convenience as the differential method, 
and the usual mode of treatment as the integral. In any problem the 
fundamental working assumption of the differential method is, not 
that a certain differential equation holds at all times and everywhere, 
but that the most probable value of the derivative at any instant is 
that assigned by the differential equation. This broader assumption 
makes possible a more reasonable treatment of many problems by the 
differential than by the integral method. For example, a most 
probable interpolatory curve is determined which gives the correct values 
at the ends of the range (cf. §16). 

In dealing with causes which manifest themselves as tendencies amid 
fluctuations the differential method has marked logical advantages. 
Both in analysis of the known and in prediction of the unknown 
it avoids a rigid transition from the differential equation, which states 
what happens in infinitesimal regions of space and time, to the asser- 
tion that a certain far-flung track is precisely followed for a long period. 
Such a transition is justified by the existence theorem for the differen- 
tial equation only if this equation holds with certain and exact truth. 
If it represents merely the most probable hypothesis at each point, 
and if the probabilities at different points are in any degree independent, 
we may not be able to assert even that the calculated path is highly 
probable. We can only say that it is the axis of a family of widening 
cornucopias, and that the wandering true path shows somewhat more 
preference for the inner than for the outer cornucopias. 

4. Sir Isaac Newton set a bad example for statisticians in his mode 
of establishing the relation which has been the admired model of 
scientific achievement for two centuries and a half. Were the solar 
system subject to a complicated set of unknown forces of as great an 
order of magnitude as the sun’s attraction—such a set, for example, 
as may exist in a nebula or near a multiple star—Newton could not 
have established gravitation by means of Kepler’s laws, which deal 
with an orbit as a whole. A statistical method would have been 
necessary; Newton would have been obliged to study the curvature of 
paths and the acceleration at various points by means of the second 
differences of the coordinates of the planets’ positions, and then to 
investigate the correlation between the acceleration, thus determined, 
of one body toward another and the distance between the two. 

A great historic method of scientific discovery has thus arisen from 
an astronomical accident. If only our tyrannical sun were smaller, 
the family of planets would enjoy some of the chaos of democratic 
societies, and the astronomer would be closer to the statistician. Sci- 
ence would have arisen later and statistics earlier. Those astronomers 
who still feel a suspicion of quackery about statistical methods, par- 
ticularly correlation, may reflect on how narrowly their own science 
missed having to wait for these very methods before emerging from the 
embryonic stage. 

5. A feature of Newton’s law of gravitation more suitable for emula- 
tion by statisticians than its mode of discovery is the determination of 
the constants. Of the various constants appearing in the integrated 
equations of motion, not all are of equal importance, and not all are de- 
termined finally from the same data. The constants of integration 
which determine the eccentricity, size and position of the orbit and the 
times at which the planet passes perihelion are of distinctly less inter- 
est than the constants which appear in the differential equation. Of 
the three latter, the masses of the two bodies are of small importance 
compared with the value of the universal constant of gravitation. In 
general the constants in a differential equation expressing a physical 
law have a different status from constants of integration, which may 
change as a result of perturbations. 

6. The leading difficulty in the application of the differential method 
is the determination from empirical data of values of the derivative in 
question sufficiently numerous and accurate to be statistically signifi- 
cant. Such quantities as rate of growth of population cannot be 
measured by a speedometer but must be estimated by means of finite 
differences. The longest series of population data, that of Sweden, com- 
prises only eighteen censuses, and the number of cases available for 
correlation must be further curtailed by the fact that proper estimates 
of rates of change cannot be made for the first and last census dates. 

In this connection there is a further difficulty which may possibly 
be overcome by continued mathematical investigation. The sampling 
distribution of correlation coefficients between two sets of numbers 
of which one is obtained by manipulation of the other is not the same 
as if the numbers were two sets of observations. Hence we cannot at 
present assign definite probable errors to correlations obtained by the 
differential method for testing agreement of hypothesis with fact. The 
difficulty is related to that of estimating the significance of correlations 
drawn from time series in economics. Some light is thrown on these 
questions in a forthcoming paper.¹ 

On account of these difficulties it will be well to use considerable 
care to procure the greatest accuracy possible in numerical evaluation 
of derivatives. This evaluation will be discussed and exemplified in 
§§ 10-13 for population growth curves. 

7. Closely related to the differential method is the growing practice 
among economic statisticians of using successive differences of time 
series in correlation problems rather than the original series. There 
is also a connection with the method of graduation proposed by 
Professor E. T. Whittaker which has attracted much favorable attention.² 
By this method values as nearly constant as a given degree of accord 
with the data will permit are assigned to the differences of a certain 
order. 

Reference may also be made to a presidential address of G. Udny 
Yule³ and to a work of the distinguished French mathematician and 
minister of marine, Emile Borel. In a note “Sur l’emploi de la méthode 
différentielle pour la comparaison des statistiques” appended to the 
1924 edition of his Éléments de la théorie des probabilités, M. Borel 
discusses the advantages of using first differences of time series rather 
than the series themselves. Using the yearly ratios of male births in 
the German Empire for thirty years he reaches the interesting 
conclusion that the year-to-year changes of these ratios, and not the ratios 
themselves, are independent random variables. 

1 Hotelling, “An Application of Analysis Situs to Statistics,” Bulletin of the American Mathematical Society, Vol. 33 (1927) pp. 467-476. 

2 Proceedings of the Edinburgh Mathematical Society, Vol. 41 (1923) pp. 63-75. Cf. the articles by R. Henderson in the Transactions of the Actuarial Society of America, Vol. 25 (1924) pp. 29 ff. and Vol. 26 (1925) pp. 52-57. 

3 “Why do we sometimes get nonsense correlations between time series?” Journal of the Royal Statistical Society, Vol. 89 (1926) pp. 1-69. 

The recent paper of Sir G. H. Knibbs¹ is concerned with causes 
affecting rates of increase, and so may be said to use the differential 
method. 

8. The Cycle. Much attention has been fixed upon the “business 
cycle.” A rhythmical contraction and expansion of the economic 
system as a whole seem to exist independently of seasonal variation 
and numerous incidental fluctuations, which are considered to be 
superimposed upon the fundamental swing. 

Theories of the business cycle fall into two classes, considering 
respectively what are called in mechanics free and forced oscillations. 
Forced-oscillation theories require some regularly recurring cosmic 
cause which influences the economic system but is not influenced by it. 
The best known theorist in this field is Henry Ludwell Moore,² who 
has suggested some effect of the planet Venus as an explanation of the 
ups and downs of prices and production. Others discern a correlation 
of weather, unemployment and revolutions with sunspots. The 
trouble with all such theories is the tenuousness, in the light of physics, 
of the long chain of causation which they are forced to postulate. Even 
if a statistical test should yield a very high correlation, the odds thus 
established in favor of such an hypothesis would have to be heavily 
discounted on account of its strong a priori improbability. 

Free oscillations are those which result from shifting internal stresses, 
and do not require the periodic application of an outside force. Most 
recent writers on economic cycles are thinking of vibrations of this type. 
High prices of pork and low prices of corn cause overbreeding of hogs 
and underplanting of corn, which lead to cheap pork and dear corn, so 
that breeding decreases and corn acreage increases, with the result that 
high pork and cheap corn return, followed by overbreeding and 
underplanting, and so on again and again. Oscillations due to monetary 
conditions are of the free variety.³ 


1 Journal of the American Statistical Association, Vol. 22 (1926) pp. 381-398 and Vol. 23 (1927) pp. 
49-59. 

2 Professor Moore's extensive labors on this subject are largely embodied in his book, Generating 
Economic Cycles (New York, Macmillan, 1923). A criticism by Mark H. Ingraham appears in the 
Journal of the American Statistical Association, Vol. 18 (1923) pp. 759-765. 

3 Cf. Irving Fisher, The Purchasing Power of Money (Macmillan, 1913) Chapter 4. 

What do recent writers mean by allusions to a business cycle of 
varying length, allusions which read somewhat like a contradiction in 
terms? Plainly they refer to a tendency to free oscillation, but a 
tendency which never comes to fruition in a regularly undulating curve 
because of continual deflections. Like a weight suspended from a 
spring, an index of the business cycle moves up and down; but, as when 
the spring is in the hands of a small boy, one can never be quite sure 
what is going to happen next. 

If a statistical test for forced oscillations is to be made, the method of 
periodogram analysis is entirely proper. This method presupposes 
periods of absolutely constant length, and has proved its worth in 
studies of binary stars. But harmonic analysis and the periodogram 
are not suited either to detect or to use in prediction any tendency to 
free vibration which is subject to serious disturbance. To detect 
vibratory tendencies in a time series we must study the correlation of 
short-term changes of the variable with the magnitude of the variable. 

The simplest differential equation whose solution is periodic but 
which does not itself contain a periodic function is 

$$\frac{d^2x}{dt^2} = -k^2 x,$$

the physical interpretation of which is that the acceleration at any 
time is negatively proportional to the displacement at that time. The 
solution may be written x = A cos k(t − ε), representing simple harmonic 
motion with period 2π/k. 

If now we should find between a long time series and its second 
differences¹ a sufficiently marked negative correlation, we should be 
justified in ascribing to the series a tendency to periodicity. The 
regression equation 

$$\frac{d^2x}{dt^2} = -b(x - a)$$

would then imply a tendency of x to oscillate with period 2π/√b 
about a mean value a. A graphical interpretation, if x be plotted as 
ordinate against time, is that above a certain line x = a the curve is 
usually, though not invariably, concave downward, while below this 
line it is usually concave upward. This interpretation does not require 
linearity of regression. 
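
By way of illustration only, the regression of second differences upon the series may be sketched in Python. The artificial series (a cosine of period 40 about the level 100 with a small random disturbance), the function names, and the use of plain central second differences without any correction are all assumptions made for the example, not part of the original investigation.

```python
import math
import random

def second_differences(x, dt=1.0):
    """Central second differences, one for each interior point of the series."""
    return [(x[i + 1] - 2.0 * x[i] + x[i - 1]) / dt ** 2 for i in range(1, len(x) - 1)]

def regress(u, v):
    """Least-squares line v = intercept + slope * u; returns (intercept, slope, correlation)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    slope = suv / suu
    return mv - slope * mu, slope, suv / math.sqrt(suu * svv)

def estimate_period(x, dt=1.0):
    """Fit second difference = -b (x - a); returns (a, b, period, correlation)."""
    d2 = second_differences(x, dt)
    intercept, slope, corr = regress(x[1:-1], d2)
    b = -slope                      # slope of the regression line is -b
    a = intercept / b               # intercept of the regression line is a * b
    return a, b, 2.0 * math.pi / math.sqrt(b), corr

# Hypothetical series: period 40 about the level 100, slightly disturbed.
rng = random.Random(0)
series = [100.0 + 10.0 * math.cos(2.0 * math.pi * t / 40.0) + rng.gauss(0.0, 0.02)
          for t in range(400)]
a_hat, b_hat, period_hat, corr = estimate_period(series)
```

In this artificial case the correlation is strongly negative and the period recovered from the regression coefficient is close to the one built into the series; with heavier disturbances both would deteriorate.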

An objection to this procedure might possibly be made on the ground 
that if our statistical series were replaced by numbers drawn out of a 
hat we should expect to find for the correlation of the series with its 
second differences a very high value. Indeed, the graph has the 
property mentioned in the last paragraph of apparent periodicity. 
Numbers drawn at random independently will yield a correlation which, as 
the length of the series increases, approaches −2/√6 = −.816. 

1 Corrected if necessary to give a fair estimate of the second derivatives. 

The answer is that the numbers in a statistical series are not drawn 
independently at random. The very fact that they constitute a series 
signifies that there is some relation between successive members. If 
their first differences are random the spurious element in the correlation 
is trivial, and if the second differences are random it disappears. 
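
The three cases just distinguished may be tried numerically. The following Python sketch is an illustration under assumed conditions (normally distributed draws, a series of 20,000 terms, a fixed seed); the function names are hypothetical.

```python
import math
import random

def corr_with_second_differences(x):
    """Correlation between the interior values of x and its central second differences."""
    u = x[1:-1]
    v = [x[i + 1] - 2.0 * x[i] + x[i - 1] for i in range(1, len(x) - 1)]
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

def cumulate(steps):
    """Running sums, so that the first differences of the result are `steps`."""
    total, out = 0.0, []
    for s in steps:
        total += s
        out.append(total)
    return out

rng = random.Random(1)
draws = [rng.gauss(0.0, 1.0) for _ in range(20000)]

r_independent = corr_with_second_differences(draws)                        # numbers out of a hat
r_random_first = corr_with_second_differences(cumulate(draws))             # first differences random
r_random_second = corr_with_second_differences(cumulate(cumulate(draws)))  # second differences random
limit = -2.0 / math.sqrt(6.0)                                              # about -0.816
```

The first correlation should fall near the limiting value, while the other two should be small, in accordance with the statement in the text.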

The conclusion seems inescapable that the relative importance of 
free oscillations and mere random wiggles is fairly measured by the 
coefficient of correlation between a series and its second differences, 
and that the period may be determined from the regression equation. 

9. Population and the Logistic. The simple assumption that the 
rate of increase of population will remain constant forever leads, as 
Malthus pointed out, to absurdities. Somewhere there must be a 
saturation level above which population cannot rise. It is natural 
therefore to ask what can be done with the assumption that the relative 
rate of increase is a function of the population which approaches zero 
as the population p approaches its extreme value c. The simplest 
case is 

$$\frac{1}{p}\frac{dp}{dt} = a - bp = b(c - p), \tag{9.1}$$

where a = bc. The integrated form of this equation,¹ 

$$p = \frac{c}{1 + m e^{-at}}, \tag{9.2}$$


where m is the constant of integration, gives the logistic curve of 
Verhulst. Raymond Pearl and Lowell J. Reed, in a numerous series of 
publications² beginning in 1920, have fitted logistic curves to various 
populations of humans and of imprisoned fruit flies. One of their most 
interesting conclusions is that in the United States the absolute rate of 
increase dp/dt ceased to increase and began to decrease in April, 1914, 
and that a population of 197,270,000 will be approached 
asymptotically. They vary the equation (9.2) for certain populations by replacing 
the exponent −at by a polynomial in t with coefficients to be 
determined from the data. They also add a constant term to the right side 
of the equation, apparently with no justification except the better fit 
thus obtainable. 
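
For the reader's convenience the passage from (9.1) to (9.2) may be written out; this derivation is supplied here as a worked step and uses only the symbols a, b, c and m of the text. It also shows that, for the simple form (9.2), the absolute rate of increase is greatest at half the saturation level, which for an asymptote of 197,270,000 would be about 98,600,000.

```latex
% Integration of (9.1), with a = bc
\begin{align*}
\frac{dp}{p\,(c-p)} &= b\,dt, \qquad
\frac{1}{p\,(c-p)} = \frac{1}{c}\left(\frac{1}{p} + \frac{1}{c-p}\right),\\
\log\frac{p}{c-p} &= bc\,t + \text{const.} = at - \log m,\\
p &= \frac{c}{1 + m e^{-at}}.
\end{align*}
% Point of inflection: dp/dt = b p (c - p) is greatest when p = c/2.
\begin{align*}
\frac{d}{dp}\bigl[b\,p\,(c-p)\bigr] = b\,(c - 2p) = 0
\quad\Longrightarrow\quad p = \tfrac{1}{2}c .
\end{align*}
```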


1 A less known but for numerical calculation more useful form of this equation is 
$$p = \tfrac{1}{2}c\,[1 + \tanh \tfrac{1}{2}a(t - w)].$$

2 See Pearl, Studies in Human Biology (Baltimore, 1924) Chapters 24 and 25, and the references there given. 

Logistic curves have been used extensively to describe the growth 
rates of animals and plants. Since the curve also represents auto- 
catalytic chemical reactions it has been suggested that growth is con- 
trolled by such a reaction. This application of the logistic has been 
criticised by Bakhuyzen and Alsberg in the papers cited in §1. 
Recently the logistic has been used by H. J. Ettlinger to study the learning 
process.¹ 

No one can look at the charts displayed by Pearl and Reed without 
admitting that the dots representing observations are very close to the 
curves. But with three or more disposable constants and a rather 
small number of observations, even a close fit must always face the 
question of significance. If we wish merely to give a summary of the 
past fourteen censuses of the United States, the equation (9.2), with a 
certain set of values for m, a and c, is useful. If, however, we wish to 
predict the population a century hence we must take account primarily 
of underlying causes. 

Some of the causes of population change are not direct consequences 
of the size of the population. Wars and epidemics take their toll, 
sanitation and knowledge spread, legal restrictions are placed upon 
immigration. It is sometimes held that such influences are negligible 
as disturbances of the great movement of a population figure along its 
proper logistic. It may jump the track when a world war is followed 
by an influenza epidemic, but is supposed immediately to jump back 
onto the track. 

From our point of view, however, it may be possible to place con- 
siderable faith in the differential equation (9.1) without accepting as a 
valid description of the past and future course of population one and the 
same logistic. If a catastrophe throws population off its track we 
should then expect it to go along a different logistic track. The new 
track and the old will satisfy the same differential equation (9.1) pro- 
vided conditions have not changed so greatly as to alter the values of 
a and b; but the constant of integration m will be different. 

In using the logistic or any similar hypothesis to make population 
estimates the first step should be to express $\frac{1}{p}\frac{dp}{dt}$ as accurately as 
possible in terms of p. The degree of confidence to be placed in our 
population estimates will depend upon the correlation (linear or 
curvilinear) which $\frac{1}{p}\frac{dp}{dt}$ has in the past shown with p and upon our 
faith in the permanence of the relation. 

1 American Mathematical Monthly, Vol. 33 (December, 1926) pp. 506-510. 

The manner of determining the constant of integration, which in this 
case merely fixes the position in time of a movement whose shape and 
size are determined by the differential equation, will depend on our 
purpose. If we are only trying to give a picture of the general history 
of the population, we slip our rigid curve back and forth on the diagram 
until by some criterion such as that of least squares it passes among the 
dots as satisfactorily as possible. But if our purpose is to forecast the 
size of the population in future years, the constant of integration will 
be determined by just one census, the latest. To give as the most 
probable value of the future population of the United States a con- 
tinuous function which for the latest census year takes on a value 
different from that of the known population is ridiculous. Likewise for 
intercensal interpolation it is better to use a function which takes on 
the correct values at census dates than a curve which sweeps smoothly 
through the centuries, above some of the dots representing our knowl- 
edge and below others. Indeed we may, in assigning a most probable 
value to the population at any time, determine a different analytic 
function for each period between successive censuses. Interpolation 
raises some novel questions which will be dealt with after we have 
considered means of selecting the particular differential equation which is 
to give the most probable values of $\frac{1}{p}\frac{dp}{dt}$. 
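
The rule that, for forecasting, the constant of integration is fixed by the latest census alone may be sketched as follows. This is a minimal illustration in Python; the values of a and c, the census date and the census figure are hypothetical, and the function names are not taken from the paper.

```python
import math

def logistic(t, a, c, m):
    """Equation (9.2): p = c / (1 + m * exp(-a t))."""
    return c / (1.0 + m * math.exp(-a * t))

def m_through(t0, p0, a, c):
    """Constant of integration that makes the logistic pass through (t0, p0)."""
    return (c / p0 - 1.0) * math.exp(a * t0)

# Hypothetical constants of the differential equation (9.1): a per year, c in millions.
a, c = 0.03, 200.0
# Hypothetical latest census: 105 millions in 1920.
t_latest, p_latest = 1920.0, 105.0

m = m_through(t_latest, p_latest, a, c)
forecast_1950 = logistic(1950.0, a, c, m)   # a forecast thirty years ahead
```

The curve so determined reproduces the latest census exactly; a catastrophe which displaced the population would be met by recomputing m from the new figure while leaving a and c unchanged.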
This we shall do by plotting against the population p of a country 
for each census the estimated value of $\frac{1}{p}\frac{dp}{dt}$. If a straight line fits well 
the resulting points, its equation 

$$\frac{1}{p}\frac{dp}{dt} = a - bp,$$

in which a and b may well be determined as usual by least squares, 
defines, except for position in time, a logistic representing the past 
growth of the population.¹ Any systematic deviations from linearity 
which may appear will indicate improvements upon the logistic 
hypothesis. 

1 This method of fitting a logistic is somewhat similar to one proposed by G. Udny Yule in one of his presidential addresses to the Royal Statistical Society (Journal of the Royal Statistical Society, Vol. 88, 1925, p. 1). The present paper gives a method of dealing with non-uniform census intervals, which Mr. Yule does not do. He does not think well of his own method because the points on his diagram do not lie very close to a straight line and because the portion of the straight line which is involved in long-range forecasting extends far from the vicinity of the dots. But these would really seem to be advantages because they make clear the shakiness of the extrapolation. In considering a prediction it is safer to have before one’s eyes the fluctuating growth rates of the past than the deceptively smooth curve resulting from their cumulation. 

Population estimates may be improved by considering what is 
known of births, deaths and migration and by making historical studies 
on which to base corrections for census errors. These refinements 
would carry us too far from our main purpose, which is to illustrate the 
method and discuss the logic of differential relations in statistics. 
Methods of allowing for additional knowledge can be found; we shall 
follow the example of previous curve-fitters in ignoring immigration and 
vital statistics and in supposing the censuses accurate. 
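
The straight-line fit of growth rate against population described in this section may be sketched thus. The figures are invented so as to lie exactly on a line and are not census data; the function name is an assumption of the example.

```python
def fit_growth_line(p, g):
    """Least-squares fit of g = a - b p; returns (a, b, c) with saturation level c = a / b."""
    n = len(p)
    mp, mg = sum(p) / n, sum(g) / n
    slope = (sum((x - mp) * (y - mg) for x, y in zip(p, g))
             / sum((x - mp) ** 2 for x in p))
    b = -slope              # the fitted line has slope -b
    a = mg + b * mp         # and intercept a
    return a, b, a / b

# Hypothetical populations (millions) and relative growth rates lying exactly on a line.
pops = [10.0, 20.0, 40.0, 80.0, 120.0]
rates = [0.03 - 0.00015 * x for x in pops]
a_fit, b_fit, c_fit = fit_growth_line(pops, rates)
```

With real growth rates the points will scatter about the line, and the distance of the intercept c = a/b from the observed populations shows at a glance how far the forecast of a saturation level is an extrapolation.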

10. Determination of Growth Rates. We have then the magnitudes 
$p_1, p_2, \ldots, p_n$ which a population or other variable takes at the 
respective times $t_1, t_2, \ldots, t_n$, and wish to estimate the values of 

$$\frac{1}{p}\frac{dp}{dt} = \frac{d \log p}{dt}.$$

It is possible to do this by means of the formulas for derivatives in 
terms of differences obtained from the interpolation formulas of 
Newton, Stirling, Bessel and Everett.¹ The laborious use of divided 
differences will be necessary since censuses, alas, are not taken with the 
uniform ten-year intervals often assumed. And with regard to logic 
as well as to labor the use of these arbitrary formulas seems less 
desirable than the following method, which utilizes our knowledge of 
the general nature of growth. 

If the expression giving the relative growth rate in terms of p be 
expanded about any particular value of p and terms of order higher 
than the first be dropped, the result may be written in the form (9.1), 
page 291. This procedure is valid so long as the range in which it is 
applied is short. It amounts to replacing a small arc of a curve by a 
chord. Consequently the logistic equation (9.2) may be assumed to 
hold in short intervals for the purpose of determining the growth rates; 
and this involves no fallacy even if we are examining whether the 
logistic holds in long intervals. The process is analogous to comparing 
the directions of a number of short chords of an unknown curve to see 
whether it is a straight line. 

In fact our estimate of d(log p)/dt at any census date will be the 
value of this derivative calculated from the logistic equation (9.2) in 
which the three constants are determined by the three successive 
censuses of which the middle one is on the date in question. Thus in 
effect we fit a succession of overlapping logistic arcs and then look to 
see how nearly they combine into one curve. 

1 Handbook of Mathematical Statistics (Cambridge, Massachusetts, 1924) pp. 50-52. 

Substituting three consecutive pairs of values of p and t in (9.2) we 
obtain 

$$c = (1 + m e^{-at_1})\,p_1 = (1 + m e^{-at_2})\,p_2 = (1 + m e^{-at_3})\,p_3; \tag{10.1}$$

whence 

$$m = -\frac{p_2 - p_1}{p_2 e^{-at_2} - p_1 e^{-at_1}} = -\frac{p_3 - p_2}{p_3 e^{-at_3} - p_2 e^{-at_2}} \tag{10.2}$$

and 

$$p_1(p_3 - p_2)e^{-at_1} + p_2(p_1 - p_3)e^{-at_2} + p_3(p_2 - p_1)e^{-at_3} = 0. \tag{10.3}$$

After determining a from (10.3) we may find m from (10.2); and then 
from (9.2) we have 

$$\frac{1}{p}\frac{dp}{dt} = \frac{a m e^{-at}}{1 + m e^{-at}}. \tag{10.4}$$

If we substitute the second member of (10.2) for m and put $c = t_2 - t_1$, 
(10.4) takes the form, for $t = t_2$, 

$$\left(\frac{1}{p}\frac{dp}{dt}\right)_{t_2} = \frac{a(p_2 - p_1)}{p_1(e^{ac} - 1)}. \tag{10.5}$$

Now make the substitutions 

$$t_2 - t_1 = c, \quad t_3 - t_2 = c(1 + y), \quad e^{-ac} = x, \quad \frac{p_1(p_3 - p_2)}{p_3(p_2 - p_1)} = r. \tag{10.6}$$

The growth rate will be found from 

$$\left(\frac{1}{p}\frac{dp}{dt}\right)_{t_2} = \frac{p_2 - p_1}{c\,p_1}\cdot\frac{x \log x}{x - 1} \tag{10.7}$$

when x has been calculated from the equation 

$$x^{2+y} - (r + 1)x + r = 0 \tag{10.8}$$

obtained by manipulation of (10.3) and substitution from (10.6). Let 
us investigate this peculiar equation. 
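
The manipulation referred to may be written out as follows. The steps are supplied here as a worked check and introduce no symbols beyond those of (10.3), (10.5) and (10.6).

```latex
% Divide (10.3) by p_3 (p_2 - p_1) e^{-a t_2}:
\begin{align*}
\frac{p_1(p_3-p_2)}{p_3(p_2-p_1)}\,e^{a(t_2-t_1)}
 - \frac{p_2(p_3-p_1)}{p_3(p_2-p_1)}
 + e^{-a(t_3-t_2)} &= 0 .
\end{align*}
% By (10.6), e^{a(t_2-t_1)} = 1/x and e^{-a(t_3-t_2)} = x^{1+y}; moreover
\begin{align*}
\frac{p_2(p_3-p_1)}{p_3(p_2-p_1)}
 = \frac{p_1(p_3-p_2) + p_3(p_2-p_1)}{p_3(p_2-p_1)} &= r + 1 ,\\
\frac{r}{x} - (r+1) + x^{1+y} &= 0 ,\\
x^{2+y} - (r+1)\,x + r &= 0 .
\end{align*}
% Likewise (10.7) follows from (10.5) on putting a = -(\log x)/c and e^{ac} = 1/x:
\begin{align*}
\frac{a\,(p_2-p_1)}{p_1\,(e^{ac}-1)}
 = \frac{-\log x}{c}\cdot\frac{p_2-p_1}{p_1}\cdot\frac{x}{1-x}
 = \frac{p_2-p_1}{c\,p_1}\cdot\frac{x\log x}{x-1} .
\end{align*}
```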

11. Solution of the Transcendental Equation. For any value of 
y, x = 1 is a solution of (10.8), but a solution to be rejected because, 
by (10.6), it would imply a = 0. If the observations are equally spaced 
in time, y = 0 and the other root of (10.8) is x = r. 

For unequally spaced observations we must find x by approximation. 
Now in dealing with complicated transcendental functions it is dan- 
gerous to assume offhand the properties of continuity and local monot- 
ony necessary for the successful convergence of approximation proc- 
esses. Fallacies into which one might easily fall in this way could be 
cited. Hence we shall do well to examine the curve K in the xy-plane 
defined by (10.8). Incidentally, the method of doing this may be of 
intrinsic interest, involving as it does a kind of geometrical reasoning 
analogous to mathematical induction. Solving (10.8) for y we have 

$$y = \frac{\log[(r + 1)x - r]}{\log x} - 2, \tag{11.1}$$

which shows that y is a single-valued continuous function of x, 
provided $x > \frac{r}{r+1} > 0$. It may be remarked that for an increasing 
population the value of r is necessarily positive and is ordinarily 
between 1/2 and 1, while that of x is necessarily between 0 and 1. Also 
since $t_3 > t_2 > t_1$ it follows from (10.6) that y > −1. For values of y 
greater than −1 we shall prove that x is a continuous, monotonic 
decreasing function of y. 

From (11.1) it is evident that, as y approaches +∞, x must approach 
r/(r+1). When x approaches 1, y approaches r − 1. 

By means of the rule for differentiating implicit functions we find 
from (10.8) 

$$\frac{dx}{dy} = \frac{x^{2+y}\log x}{r + 1 - (2 + y)\,x^{1+y}}. \tag{11.2}$$

For x > 0 and y > −1 this expression is everywhere finite, continuous 
and single-valued except along the curve, which we shall call C, 

$$x = \left(\frac{r + 1}{2 + y}\right)^{\frac{1}{1+y}}, \tag{11.3}$$

where it becomes infinite. 

Let us digress a moment to study C. From (11.3) we have 

$$\frac{1}{x}\frac{dx}{dy} = -\frac{1}{(1 + y)^2}\log\frac{r + 1}{2 + y} - \frac{1}{(1 + y)(2 + y)}. \tag{11.4}$$

The graph is shown in Figure I. The features essential to the argument 
are: 

(a) C has the point y = r − 1, x = 1 in common with K, and has no 
other finite point in common with x = 1. 

(b) At this point the slope of C, found from (11.4), is 

$$-\frac{1}{r(r + 1)}. \tag{11.5}$$

(c) For y > −1 C has at every point, as shown by (11.4), a tangent 
which is not vertical. For x > 1, y < r − 1, and the slope of C is always 
negative. 

(d) The region 0 < x < 1, y > −1 of the plane is divided by C into 
two regions within each of which the sign of (11.2) is constant. 

[Figure I. Note to the figure: in the first vertical line y = −1.] 

Turning to K we find from (11.1), 

$$\frac{dx}{dy} = \frac{x\,[(r+1)x - r]\,(\log x)^2}{(r+1)\,x\log x - [(r+1)x - r]\log[(r+1)x - r]},$$

which for x = 1 takes on the indeterminate form 0/0. Differentiating, 
therefore, each term of the fraction we have, after a little manipulation, 

$$\lim_{\substack{x \to 1 \\ \text{on } K}} \frac{dx}{dy} = \lim_{x \to 1} \frac{2[(r+1)x - r] + [2(r+1)x - r]\log x}{(r+1)\left\{1 - \dfrac{\log[(r+1)x - r]}{\log x}\right\}} = \frac{-2}{r(r+1)}. \tag{11.6}$$

From the point y = r − 1, x = 1, K descends more rapidly than C 
(compare (11.5) and (11.6)). Entering thus a region in which its 
slope is negative it must forever afterward have a negative slope. 
For, on account of (c), it cannot meet C without first acquiring a 
positive slope; and it cannot, on account of (d), acquire a positive slope 
until after crossing C. Hence x as defined by (10.8) is a monotonic 
decreasing function of y. Similarly, a point moving to the left on K 
from its intersection with C enters a region in which the slope of K is 
negative, and cannot escape from this region without crossing C. But 
this is impossible because the slope of K would then, by (11.2), have 
to be infinite and that of C therefore positive, contrary to (c). This 
completes the proof of the continuous monotonic character of x as a 
function of y. 

To calculate x we use (11.1), substituting assumed values of x in the 
right member. If the expression proves to be greater than y, we take 
a greater value for x; if smaller, a smaller value. We shall do well in 
this process, since we know no limit on $d^2x/dy^2$ to safeguard our results, 
by finding values of x close together which give values of y both 
greater and less than the known value. By the property of monotony 
which has been proved we are then sure that the true value of x is 
between these limits. 

A convenient beginning for the approximation process when the 
observations are at nearly equal intervals and y is therefore small is 
made if we observe, from (11.2), that for y = 0 and x = r 

$$\frac{dx}{dy} = \frac{r^2 \log r}{1 - r};$$

whence 

$$x = r + y\,\frac{r^2 \log r}{1 - r},$$

approximately. On account of the considerations of the last paragraph 
we take a value of x somewhat further removed from r than this. 

For subsequent approximations to x interpolation is used. When x 
has been determined, the growth rate is found from (10.7). 

12. Two quantities which might be expected to differ little from the 
relative growth rate at time $t_2$ are 

$$q = \frac{p_2 - p_1}{c\,p_1} \quad\text{and}\quad s = \frac{p_3 - p_2}{c\,p_3}.$$

It is noteworthy that for y = 0 the growth rate as determined by 
fitting a logistic to three successive censuses lies between q and s. For 
by (10.6), r = q/s; whence, from (10.7) with x = r, 

$$\left(\frac{1}{p}\frac{dp}{dt}\right)_{t_2} = qs\,\frac{\log q - \log s}{q - s};$$

and by Rolle's theorem, 

$$\frac{\log q - \log s}{q - s}$$

must lie between 1/q and 1/s. 
These limits on the calculated growth rate have been found useful 
as a rough check on calculations. They are, however, especially 
valuable in showing the smallness of the difference between the true 
growth rate, of which they are independent though crude estimates, and 
that found by the logistic method. 
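
The bracketing of the logistic growth rate by q and s is easily checked numerically. The following is a minimal sketch only. It assumes that q = (p2 - p1)/(c p1) and s = (p3 - p2)/(c p3) for three censuses at equal intervals c; the function name is ours, and the figures for 1790, 1800 and 1810 are taken from Table I merely for illustration.

```python
import math

def logistic_rate_from_three(p1, p2, p3, c):
    """Relative growth rate at the middle census, for censuses equally spaced by c.

    Assumes q = (p2 - p1) / (c * p1) and s = (p3 - p2) / (c * p3),
    and uses the rate q * s * (log q - log s) / (q - s).
    """
    q = (p2 - p1) / (c * p1)
    s = (p3 - p2) / (c * p3)
    rate = q * s * (math.log(q) - math.log(s)) / (q - s)
    return q, s, rate

# Populations in millions for 1790, 1800, 1810; interval of ten years.
q, s, rate = logistic_rate_from_three(3.929, 5.308, 7.240, 10.0)
print(f"q = {q:.5f}, s = {s:.5f}, logistic rate = {rate:.5f}")
```

For any growing population the computed rate falls between s and q, so that the two crude quantities serve as a quick check on the more laborious calculation.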

13. For continental United States the growth rate calculated in this 
way is given in Table I for each of the censuses except the first and last, 
in which cases it is of course impossible. An estimate of the growth 
rate in 1920, made by averaging birth, death and immigration rates for 
three years including the census date, is however added. The same 
data are shown in Figure II. The unit of time used is the year. 


FIGURE II 
POPULATION AND RATE OF INCREASE, CONTINENTAL UNITED STATES 

TABLE I 
POPULATION AND RATE OF INCREASE, CONTINENTAL UNITED STATES 

                     Population      Rate of increase per thousand,
Date                 (millions)      1000 (1/p)(dp/dt)

1790                    3.929            ....
1800                    5.308           30.05
1810                    7.240           29.90
1820                    9.638           28.98
June 1, 1830           12.866           28.90
June 1, 1840           17.069           29.31
June 1, 1850           23.192           30.55
June 1, 1860           31.443           25.17
June 1, 1870           38.558           22.88
June 1, 1880           50.156           24.56
June 1, 1890           62.948           20.79
June 1, 1900           75.995           19.07
1910                   91.972           16.70
1920                      …                …
FIGURE III 
POPULATION AND RATE OF INCREASE, ENGLAND AND WALES 

Similar data for England and Wales are presented in Table II and 
Figure III. 


TABLE II 
POPULATION AND RATE OF INCREASE, ENGLAND AND WALES 

                     Population      Rate of increase per thousand,
Date                 (millions)      1000 (1/p)(dp/dt)

1801                    8.893            ....
1811                   10.164           14.64
1821                   12.000           15.64
1831                   13.897           14.11
June 1841              15.914           12.83
March 31, 1851         17.928           11.70
April 1861             20.066           11.81
April 1871             22.712           12.87
April 1881             25.974           12.19
April 1891             29.003           11.24
April 1901             32.528           10.91
April 1911             36.070            6.97
1921                   37.885           (8.37)


Expressing the growth rate linearly in terms of the population by 
least squares and using all the data of Table I we have 

$$1000 \cdot \frac{1}{p}\frac{dp}{dt} = 31.489 - .16938\,p.$$
However, the growth rate for 1920 was obtained in a different manner 
from the others, and is affected by the anomalous post-war situation. 
Omitting this case we have, instead of the above equation, 

$$1000 \cdot \frac{1}{p}\frac{dp}{dt} = 31.236 - .15949\,p, \tag{13.1}$$

which is plotted in Figure II. 

The asymptotic population is found by equating the growth rate to 
zero. For the first of the two equations it is 185.87 million; for the 
second, 195.87. Pearl and Reed give 197.27 million for the upper 
asymptote of their logistic, which is fitted by means of the censuses of 
1790, 1850 and 1910. 
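
The asymptote is simply the ratio of the constant term to the coefficient of p. A minimal sketch follows; the function name is ours, and only the rounded coefficients printed above are used, so the quotients agree with the asymptotes stated in the text only to within a few hundredths of a million.

```python
def asymptote(intercept, slope):
    """Population at which the growth rate intercept - slope * p vanishes."""
    return intercept / slope

# Coefficients per thousand, population in millions.
us_all = asymptote(31.489, 0.16938)           # all data of Table I
us_without_1920 = asymptote(31.236, 0.15949)  # equation (13.1)
print(f"with 1920:    {us_all:.2f} million")
print(f"without 1920: {us_without_1920:.2f} million")
```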

For England and Wales we find for the data of Table II, other than 
the last line, the equation 

$$1000 \cdot \frac{1}{p}\frac{dp}{dt} = 17.320 - .23539\,p, \tag{13.2}$$

indicating an ultimate population of 73.58 million. The graph is 
Figure III. Pearl and Reed, using a more complicated curve, give an 
upper limit of 73.043 million. 

The correlations are strikingly high: —.974 for the United States 
if 1920 is included, —.963 without this case; for England and Wales 
the correlation is —.906 with the last data, —.880 without them. If 
the result of the defective United States census of 1870 had been re- 
placed by a slightly higher figure the extraordinarily large correlation 
coefficient would have been even greater. 

Of course the inference that these populations are bound to continue 
along their respective logistic paths does not have the conclusiveness 
which would go with such high correlation coefficients in many statis- 
tical arguments. Even if the probable errors as calculated on some 
conventional scheme should prove to be very small, considerable 
skepticism would be justified as to forecasts extending even a century 
into the future. In our age evolution is rapid, and analogies from past 
to future correspondingly weak. Moreover, we have at the utmost 
only thirteen observations for calculating a correlation coefficient. 
The usual methods of determining probable errors are not applicable 
to such small samples, and are doubly weak because the observations 
are ordered in time. What is needed is a measure of the probability, 
upon some definite alternative hypothesis, of obtaining by chance 
such high correlations. The chief difficulty here is to frame a definite 
hypothesis to compete with that from which the logistic arises. Such 
an hypothesis, which should have some justification in general 
considerations apart from the data in hand, could be used to obtain a 
curve of distribution of correlations of the type we have calculated. 
In this connection considerable mathematical difficulties might be 
encountered; but sufficiently good numerical results could undoubtedly 
be obtained by means of experiments with cards and dice, conducted 
and interpreted in the light of general considerations such as those 
discussed in the paper cited in §6. 

Since the populations for which we have census counts have all 
been increasing throughout the period of the record, our data cannot 
tell us for certain whether the cause of the lowered growth rates is to 
be found in the increased density of population or in some other factor 
associated with the passage of time. Such factors are surely not 
lacking. Birth control, the economic, legal and social emancipation 
of women, the loosing of many old superstitions including the belief 
in “be fruitful and multiply” as a categorical imperative, the transi- 
tion from farm to urban life, higher standards of comfort and education, 
immigration restriction—these and many more might be cited. There 
is of course the question of the degree to which these factors are inde- 
pendent of density of population. It may be alleged that they are all 
consequences of increased population and are therefore indirect re- 
sults of a pressing against the means of subsistence. This idea cannot 
be proved or disproved statistically; it certainly makes no strong ap- 
peal to common sense. 

The partial correlation of population with growth rate, eliminating 
time in the customary linear manner, has for the twelve United States 
cases used the high value —.746. The probability that a value so 
far removed from zero would be obtained by random sampling from 
an uncorrelated aggregate is a little less than .01, and so may be con- 
sidered significant. 

If we had longer series of census data it might be worth while to 
examine the degree of conformity of population growth to the 
differential equation 

$$\frac{1}{p}\frac{dp}{dt} = a - bp - ct,$$

which reduces to the logistic for c = 0, but which also allows for the 
secular trend in the growth rate. The integral curve 
will fit the data better than the logistic, since there is an additional 
disposable constant, and the a priori probability is at least as great. 
But the forecast supplied by the new equation will be markedly differ- 
ent from that of the logistic. It will indicate that the population of 
the United States, for example, is to increase slowly for a few decades 
to a definite maximum, after which there will be a decrease. This is 
indeed not unlikely. If we follow the curve enough further, we shall 
approach zero population, and this will serve as an example of the 
dangers of extrapolation. A less disagreeable outcome may be assured 
by various modifications of the differential equation which will in no 
wise diminish the goodness of fit of the integrated equation to the data. 
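
The behavior just described may be seen by integrating the equation numerically. The sketch below is an illustration only: a and b are borrowed from (13.1), although they were not fitted to this equation; the value of c is purely hypothetical; the starting population is the 1920 figure of Table III; and the function name is ours.

```python
def integrate_trend_logistic(p0, a, b, c, years, step=0.1):
    """Fourth-order Runge-Kutta integration of dp/dt = p * (a - b*p - c*t).

    Time t is measured in years from the starting date.
    Returns a list of (t, p) pairs.
    """
    def f(t, p):
        return p * (a - b * p - c * t)

    n = int(round(years / step))
    t, p = 0.0, p0
    path = [(t, p)]
    for _ in range(n):
        k1 = f(t, p)
        k2 = f(t + step / 2, p + step * k1 / 2)
        k3 = f(t + step / 2, p + step * k2 / 2)
        k4 = f(t + step, p + step * k3)
        p += step * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += step
        path.append((t, p))
    return path

# a, b per year from (13.1); c = 0.0002 per year squared is an invented trend.
path = integrate_trend_logistic(p0=105.71, a=0.031236, b=0.00015949,
                                c=0.0002, years=400)
t_peak, p_peak = max(path, key=lambda tp: tp[1])
print(f"maximum of {p_peak:.1f} million about {t_peak:.0f} years after the start")
print(f"population after 400 years: {path[-1][1]:.3f} million")
```

With any positive c the growth rate must eventually become negative, so that the curve rises to a maximum and then falls toward zero, whatever the particular constants assumed.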

It may be noted that a, the constant term in the logistic differential 
equation, is nearly twice as great for the United States as for England 
and Wales, though a believer in purely biological determination of 
population growth would expect these two quantities to be practically 
identical. 

14. Doubts as to its universal validity to the contrary notwith- 
standing, an important place must be given the logistic hypothesis. 
The practical importance of population questions impels us to fore- 
cast and to interpolate; and in addition to fitting well the known data 
the logistic has the enormous merit of simplicity. As Bertrand Rus- 
sell has somewhere said, it is often more important that a scientific 
hypothesis be simple than that it be true. 

For a particular problem the logistic hypothesis may be defined by 
some such assumption as the following, which we make the basis of 
further work: 

The most probable value of the rate of increase $\frac{1}{p}\frac{dp}{dt}$ of the population 
of continental United States at any moment is that given by the regression 
equation (13.1). 

We make also the further assumption, not unlike many to be found 
in statistical theory: 

The probability of deviations of the actual rates of increase from those 
given by the regression equation is given by the normal law of errors; and 
such probabilities for successive elements of time are independent. 

This assumption will be used for intercensal interpolation and also 
to provide probable errors for interpolated and extrapolated values 
of the population. The assumption of independence of deviations is 
open to some question, since a special influence acting in one year is 
likely to make itself felt also in the next. For this reason the uncer- 
tainty of the estimates is somewhat greater than the probable errors 
indicate. This difficulty is unavoidable, and is inherent in a great 
deal of statistical work, including all that deals with time series. It 
is noteworthy in this connection that probability, regarded as degree 
of rational belief, generally deviates on the side of uncertainty from 
the numerical probabilities associated with the "probable errors" 
calculated on the usual assumptions, which include the substitution of 
certain characteristics of the known sample for those of the unknown 
population. 

15. Perturbations. For purposes of interpolating between observed 
values of a quantity varying according to a differential equation subject 
to error, and also to investigate the probable errors of extrapolation 
and interpolation, we now develop a theory of successive independent 
deviations. This may be regarded as a generalization of the classi- 
cal theory of errors; it is closely related also to the idea of random 
migration by short leaps which has been the subject of various works 
by Pearson and others. 

Consider, for example, population growth. How shall we calculate 
probable errors for forecasts made by means of the logistic? Let us 
follow ordinary practice and neglect for simplicity errors in the fore- 
casts resulting from errors in the assumed values of a and b. We then 
have to deal with a sort of random migration, up or down, superim- 
posed upon the central tendency to proceed logistically. Apart from 
the central tendency, there are numerous casual influences, operating 
for brief periods, which accelerate or retard the rate of growth. For 
the same reasons that justify the employment of the normal law of 
error in much statistical work, it is reasonable to assume that these 
casual disturbances of the growth rate are normally distributed. We 
may extend this assumption to take the mean value as zero, since sys- 
tematic movements have been taken account of in eliminating the 
effect of the general logistic growth. 

The assumption of independence for the random disturbances in 
successive intervals of time implies that the variance (i.e., squared 
standard deviation) of the total disturbance for a period equals the 
sum of the variances for the constituent intervals obtained by 
subdivision of the period in any manner. Consequently, variance is 
proportional to time. Our assumptions thus lead to 

$$\frac{1}{k\sqrt{2\pi\,\Delta t}}\; e^{-\frac{(g - \bar g)^2}{2k^2\,\Delta t}}\; dg \tag{15.1}$$

as the probability that the growth ratio $\Delta p/p$ in the short interval 
$\Delta t$ will lie between g and g + dg. Here $\bar g$ is the theoretical value: 

$$\bar g = (a - bp)\,\Delta t,$$

and $k^2$, which may be termed the specific variance, is to be determined 
from the data. 

The quantities $g - \bar g$ are disturbances of the integrated relative 
growth rate, and hence of the logarithm of the population. The 
curves of distribution of errors of population estimates will therefore 
be skew, since the distribution of the errors of the logarithms is 
symmetric. 

According to the assumption of an underlying tendency to follow a 
logistic with fixed values of a and $\omega$ (= a/b), a population of size $p_1$ 
at time $t_1$ would at time $t_2$ be of size 

$$\bar p_2 = \frac{\omega}{1 + \left(\dfrac{\omega}{p_1} - 1\right) e^{-a(t_2 - t_1)}} \tag{15.2}$$

were it not for the random disturbances. The logarithm of the actual 
population $p_2$ will differ from that of $\bar p_2$ by a positive or negative amount; 
the mean value of such differences is zero, and their variance is 
$k^2(t_2 - t_1)$. The variance of such quantities as 

$$\frac{\log p_2 - \log \bar p_2}{\sqrt{t_2 - t_1}} \tag{15.3}$$

equals $k^2$. As generally in estimating the variance of a population 
from a finite sample, we have as our best estimate of $k^2$ the sum of 
the squares of such quantities as that above, divided by one less than 
their number. This reduction of the denominator by one amounts 
to making some allowance for errors of sampling in obtaining a and $\omega$. 
Further allowance for such errors is made in the numerical work below 
by omitting the slight correction of the variance for the mean of the 
deviations, which is theoretically zero and is actually very nearly zero. 
If this correction were applied, k? would be reduced about one per cent. 

For numerical work it is convenient to take the logarithms in (15.3) 
to the base ten. With this convention the variance k? for the United 
States is found from the last column of Table III to equal .000 008 
848 9. (The last column is obtained from the same calculation as the 
others, but not from them; this would not be possible with sufficient 
accuracy.) 
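
The estimate of the specific variance may be retraced as follows. This is a sketch under stated assumptions: the deviations are the thirteen entries of the last column of Table III as printed, rounded to five decimals; the interval between censuses is taken as exactly ten years; and the function name is ours. Because of the rounding the result agrees with the value quoted above only approximately.

```python
import math

# Last column of Table III: log10 of (estimated / true) population, 1800-1920.
log_ratio = [0.00190, -0.00332, 0.00558, 0.00015, 0.00255, -0.01111, -0.01498,
             0.02224, -0.00880, -0.00197, 0.00549, -0.00591, 0.00444]
interval_years = 10.0  # assumed uniform interval between censuses

def specific_variance(deviations, interval):
    """Sum of squares of deviation / sqrt(interval), divided by (n - 1).

    No correction is made for the mean of the deviations.
    """
    scaled_squares = [d * d / interval for d in deviations]
    return sum(scaled_squares) / (len(deviations) - 1)

k2 = specific_variance(log_ratio, interval_years)
k = math.sqrt(k2)
print(f"k^2 = {k2:.4e}, k = {k:.6f}")
```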

16. Intercensal Interpolation. Estimates of population between 
censuses have been numerous. Complete accuracy in such estimates 
can only be attained by a combination of a perfect census with com- 
plete statistics of births, deaths, immigration and emigration. For the 
United States neither of these requirements is fulfilled. The censuses 
other than that of 1870 are doubtless fairly dependable; but even ap- 
proximately complete vital statistics have never existed. Conse- 
quently, curve fitting of some kind must be employed. 
TABLE III 
ACCURACY OF POPULATION FORECASTS 

         Population estimated
         from preceding            True population
Year     census, P̄ (millions)      P (millions)        log10 (P̄/P)

1800            5.33                    5.31              .00190
1810            7.18                    7.24             -.00332
1820            9.76                    9.64              .00558
1830           12.87                   12.87              .00015
1840           17.17                   17.07              .00255
1850           22.61                   23.19             -.01111
1860           30.38                   31.44             -.01498
1870           40.58                   38.56              .02224
1880           49.15                   50.16             -.00880
1890           62.66                   62.95             -.00197
1900           76.96                   75.99              .00549
1910           90.73                   91.97             -.00591
1920          106.80                  105.71              .00444

Mean = -.00029 
Mean square = .000 008 168 24 
k² = 13/12 of mean square = .000 008 848 92 
k = .002 975 


The simplest and commonest device is straight-line interpolation. 
The straight line is also used by the Census Bureau for estimating by 
extrapolation. But the actual growth of a population is not linear, and 
in many cases deviates from linearity in a manner which can be de- 
termined rather accurately. Thus the graph of a freely multiplying 
population is an exponential curve, which is always concave upward. 
For such a population a straight line would always give too high values 
when used for interpolation and too low values in extrapolation. The 
use of an exponential has therefore been urged and sometimes adopted. 
But the exponential merely gives errors of the opposite kind, or else 
exaggerates the errors of the straight line. For consider a population 
growing according to the logistic, as is approximately the case for actual 
populations. If its equation 

$$\frac{1}{p}\frac{dp}{dt} = a - bp$$

be approximated in any interval by an exponential 

$$\frac{1}{p_1}\frac{dp_1}{dt} = a_1,$$

it is obvious that $\dfrac{d^2 \log p}{dt^2}$ is always negative, while $\dfrac{d^2 \log p_1}{dt^2}$ is zero. 
Hence when p and $p_1$ are plotted on logarithmic scale as intersecting 
curves it appears that the exponential underestimates in interpolation 
and overestimates in extrapolation. For the upper half of the logistic 
the error committed by using the exponential between two points is 
evidently greater than that of the straight line. 

On the basis of the assumptions of §14 we shall show that a perfectly 
definite most probable interpolating curve exists, and that this best 
curve is not a logistic. Since one of these assumptions is that the most 
probable value of the rate of increase at any moment is given by a 
differential equation leading to a logistic, the last remark may seem 
paradoxical. But no one of the integral curves of the particular differ- 
ential equation (13.1) which has been derived from the data will give 
the correct values of the population at both ends of a census interval. 
One of these curves can be found which will give the census value at the 
beginning of the interval, and another to give the value at the end. 
The most probable values will be given by a curve lying between these 
two and shifting gradually from one side to the other of the strip which 
they bound. 

Divide the interval $t_1$ to $t_2$ between censuses into n parts, each of 
duration $\Delta t$. The probability that the relative growth rates in the 
successive intervals will lie between $g_1, g_2, \ldots, g_n$ and $g_1 + dg_1, \ldots, 
g_n + dg_n$ respectively is found by multiplying together expressions like 
(15.1) to be 

$$\frac{1}{(k\sqrt{2\pi\,\Delta t})^n}\; e^{-\frac{\sum (g_i - \bar g_i)^2}{2k^2\,\Delta t}}\; dg_1\, dg_2 \cdots dg_n,$$

where $\bar g_i = (a - bp_i)\,\Delta t$, $p_i$ being the population at some time in the 
ith interval. The values of $g_1, g_2, \ldots, g_n$ which make this 
probability a maximum are those which make 

$$\sum (g_i - \bar g_i)^2 / \Delta t$$

a minimum. The problem here is to find values $p_1, p_2, \ldots, p_n, 
p_{n+1}$, the first and last only being given by the data, minimizing this 
expression. It equals 

$$\sum_{i=1}^{n} \left(\frac{\Delta p_i}{p_i\,\Delta t} - a + bp_i\right)^2 \Delta t,$$

where $\Delta p_i = p_{i+1} - p_i$. Increasing the fineness of the subdivision we 
have in the limit the integral 

$$\int_{t_1}^{t_2} \left(\frac{p'}{p} - a + bp\right)^2 dt,$$

where $p' = dp/dt$, to be minimized by choice of p as a function of t 
with the assigned initial and final values. 
Calling the integrand F, the well known equation of the calculus of 
variations, 

$$\frac{\partial F}{\partial p} - \frac{d}{dt}\frac{\partial F}{\partial p'} = 0,$$

yields 

$$\frac{p''}{p^2} - \frac{p'^2}{p^3} = b^2 p - ab.$$

Here $p'' = \dfrac{dp'}{dt} = p'\,\dfrac{dp'}{dp} = \dfrac{1}{2}\dfrac{d(p'^2)}{dp}$. Making this substitution we have an 
exact differential equation, which on integration gives 

$$\frac{p'^2}{p^2} = b^2p^2 - 2abp + c.$$

Separating the variables and integrating again, we have 

$$t = t_0 \pm \int \frac{dp}{p\sqrt{b^2p^2 - 2abp + c}}.$$
The integral is expressible in terms of elementary functions for any 
set of values of the constants. For $c = a^2$, and not otherwise, the 
equation is that of the logistic. The possible cases are as follows: 

$$\text{For } c < 0, \quad p = \frac{c}{b}\,\frac{1}{a - \sqrt{a^2 - c}\,\cos\sqrt{-c}\,(t - t_0)}.$$

$$\text{For } c = 0, \quad p = \frac{2a}{b}\,\frac{1}{1 - (t - t_0)^2 a^2}.$$

$$\text{For } 0 < c < a^2, \quad p = \frac{c}{b}\,\frac{1}{a - \sqrt{a^2 - c}\,\cosh\sqrt{c}\,(t - t_0)}.$$

$$\text{For } c = a^2, \quad p = \frac{a}{b}\,\frac{1}{1 + e^{-a(t - t_0)}}.$$

$$\text{For } c > a^2, \quad p = \frac{c}{b}\,\frac{1}{a - \sqrt{c - a^2}\,\sinh\sqrt{c}\,(t - t_0)}.$$

The first and third of these expressions are analytically identical, 
while the fifth is obtainable from them by changing the meaning of 
t. The others are limiting cases. It is interesting to observe that the 
first of the expressions for p is periodic, that each of the next two is 
symmetric about $t_0$, and that the second, third and fifth become 
infinite for certain values of t. Such behavior is unbecoming to a 
function which is to represent a population, and must occur outside the 
interval of interpolation. 
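
As a check on the five forms, each may be substituted numerically in the first integral (p'/p)² = b²p² - 2abp + c. The sketch below is an illustration under stated assumptions: the closed forms are those written above, taken with t_0 = 0; the values of a, b and c are arbitrary and have no connection with the census data; the derivative is approximated by a central difference; and the function names are ours.

```python
import math

def interpolating_curve(t, a, b, c, t0=0.0):
    """Closed-form solution of (p'/p)^2 = b^2 p^2 - 2abp + c for each range of c."""
    tau = t - t0
    if c < 0:
        return (c / b) / (a - math.sqrt(a * a - c) * math.cos(math.sqrt(-c) * tau))
    if c == 0:
        return (2 * a / b) / (1 - (a * tau) ** 2)
    if c < a * a:
        return (c / b) / (a - math.sqrt(a * a - c) * math.cosh(math.sqrt(c) * tau))
    if c == a * a:
        return (a / b) / (1 + math.exp(-a * tau))  # the logistic
    return (c / b) / (a - math.sqrt(c - a * a) * math.sinh(math.sqrt(c) * tau))

def residual(t, a, b, c, h=1e-4):
    """(p'/p)^2 - (b^2 p^2 - 2abp + c), with p' from a central difference."""
    p = interpolating_curve(t, a, b, c)
    dp = (interpolating_curve(t + h, a, b, c)
          - interpolating_curve(t - h, a, b, c)) / (2 * h)
    return (dp / p) ** 2 - (b * b * p * p - 2 * a * b * p + c)

a, b = 0.03, 0.00016  # arbitrary illustrative constants
for c in (-0.0004, 0.0, 0.0005, a * a, 0.002):
    p5 = interpolating_curve(5.0, a, b, c)
    print(f"c = {c:+.4f}: p(5) = {p5:8.2f}, residual = {residual(5.0, a, b, c):.1e}")
```

The residuals vanish to within the error of the numerical differentiation, which confirms that the five expressions belong to one family of integral curves.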

While these expressions for p are of theoretical interest they are not 
suitable for numerical calculation. This would in fact require 
simultaneous determination of the constants c and $t_0$ by successive 
approximations after selecting one of the five forms; the amount of labor 
involved would be large. But the assured existence of a solution 
between two parallel logistics which lie close together shows how to 
proceed to obtain an excellent approximation. Calculate the two 
logistics and divide the difference between their ordinates proportion- 
ately to the time since the last census and the time until the next 
census. The resulting compromise gives the correct values at the ends 
of the interval, is continuous, and can differ from the theoretically best 
value by only a small fraction of the difference between the two logistic 
estimates, which is itself small. 

In Table IV are given the results of this operation for the population 
of the United States in the decade 1890-1900, with the modification, 
which seems desirable, of making the arithmetic compromises for 
log p rather than for p itself. 


TABLE IV 
POPULATION OF THE UNITED STATES ON JUNE 1 OF EACH YEAR, 1890-1900 
(Millions) 

         Logistic from     Logistic from                     First
Year     1890 census       1900 census       Compromise      differences

1890        62.948            62.066           62.948          1.252
1891        64.290            63.398           64.200          1.265
1892        65.646            64.745           65.465          1.277
1893        67.017            66.106           66.742          1.289
1894        68.401            67.481           68.031          1.301
1895        69.798            68.870           69.332          1.312
1896        71.207            70.271           70.644          1.323
1897        72.629            71.685           71.967          1.333
1898        74.062            73.110           73.300          1.343
1899        75.506            74.547           74.643          1.352
1900        76.961            75.995           75.995
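
The construction of Table IV may be retraced as follows. This is a minimal sketch, assuming the rounded constants of (13.1) divided by 1000, the logistic (15.2) passed through each census in turn, and weights proportional to the time elapsed; the function names are ours. Since the constants are rounded, the figures agree with the table only to within a few thousandths of a million.

```python
import math

A = 0.031236     # a per year, from (13.1)
B = 0.00015949   # b per year per million, from (13.1)
K = A / B        # asymptote a/b, in millions

def logistic_from(p_ref, years):
    """Population `years` after (negative: before) a census count p_ref, by (15.2)."""
    return K / (1 + (K / p_ref - 1) * math.exp(-A * years))

def compromise(year, y1=1890, p1=62.948, y2=1900, p2=75.995):
    """Weight log10 of the two logistic estimates by nearness to each census."""
    forward = logistic_from(p1, year - y1)    # logistic through the earlier census
    backward = logistic_from(p2, year - y2)   # logistic through the later census
    w = (year - y1) / (y2 - y1)               # fraction of the interval elapsed
    log_p = (1 - w) * math.log10(forward) + w * math.log10(backward)
    return forward, backward, 10 ** log_p

for year in range(1890, 1901):
    fwd, bwd, mid = compromise(year)
    print(f"{year}  {fwd:7.3f}  {bwd:7.3f}  {mid:7.3f}")
```

The compromise coincides with the census count at each end of the decade and passes gradually from the one logistic to the other between them.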




















17. Prediction of Population. For extrapolation into the future we 
use a logistic having the values of a and b determined by (13.1) and 
starting from the latest census. For the United States this gives 








Year Population 
(millions) 
These estimates are lower than those of Pearl and Reed¹ by an amount 
between 1.4 and 1.9 millions in each case. Much of this difference is 
attributable to the fact that the extrapolating function here used is 
adjusted to give the correct value for 1920, whereas Pearl and Reed 
base their determinations only on the censuses of 1790, 1850 and 1910. 

18. As a further illustration of the process of interpolation consider 
a freely oscillating variable subject to disturbances. The assumption 
in this case is that the differential equation 

$$\frac{d^2p}{dt^2} + p = 0$$

tends to be fulfilled but that the left member is subject to numerous 
small disturbances varying from moment to moment. Putting $p''$ 
for the second derivative we seek therefore to minimize 

$$\int_{t_1}^{t_2} (p'' + p)^2\, dt.$$

If only the values of p at times $t_1$ and $t_2$ are given the integrand may be 
taken to be zero, since the differential equation, being of second order, 
yields on solution two arbitrary constants: 

$$p = A \sin t + B \cos t.$$

But if the first derivatives as well as the values of p at the ends of the 
interval are given we use the following extension of Euler's equation.² 
To find a function p of t minimizing 

$$I = \int_{t_1}^{t_2} F(p, p', p'', t)\, dt$$

and taking assigned values for itself and its first derivative at the 
limits, we replace p by $p + \epsilon w$, where w is an arbitrary function which, 
with its first derivative, vanishes for $t_1$ and $t_2$. The minimizing 
condition is the vanishing for $\epsilon = 0$ of $dI/d\epsilon$. This expression equals 

$$\int_{t_1}^{t_2} \left( w\,\frac{\partial F}{\partial p} + w'\,\frac{\partial F}{\partial p'} + w''\,\frac{\partial F}{\partial p''} \right) dt.$$

Integrating by parts the second term once and the third term twice 
we have, since $w = w' = 0$ at the limits, 

$$\int_{t_1}^{t_2} w \left( \frac{\partial F}{\partial p} - \frac{d}{dt}\frac{\partial F}{\partial p'} + \frac{d^2}{dt^2}\frac{\partial F}{\partial p''} \right) dt,$$

which must vanish for arbitrary functions w. Hence 

$$\frac{\partial F}{\partial p} - \frac{d}{dt}\frac{\partial F}{\partial p'} + \frac{d^2}{dt^2}\frac{\partial F}{\partial p''} = 0.$$

In our case 

$$F = (p'' + p)^2,$$

whence 

$$\frac{d^4p}{dt^4} + 2\,\frac{d^2p}{dt^2} + p = 0.$$

Integrating, $p = (A + at) \sin t + (B + bt) \cos t$. 
There is now the correct number of disposable constants. 

¹ Studies in Human Biology, p. 590. 
² Cf. Hadamard, Leçons sur le Calcul des Variations (Paris, 1910) pp. 134-136. 

19. Probable Errors of Interpolation and Extrapolation. The probable 
error of an estimate of future population by means of a logistic 
satisfied for the latest census will evidently be zero for the time 
of this census and will increase continuously for later times. Indeed, 
as remarked in §15, the variance of log p, of which the probable error is 
.6745 times the square root, is directly proportional to the time. In 
terms of the specific variance $k^2$ the probable error of the logarithm of 
the population at time t is 

$$.6745\,k\sqrt{t - t_1},$$

where $t_1$ is the time of the latest census. The graph is a parabola. If 
we plot estimated population by a heavy line on a logarithmic scale 
as in Figure IV, we may indicate by dotted lines limits between which 
at any time the population is as likely as not to lie. The figure can be 
thought of as obtained by deforming a parabola until its axis lies along 
a logistic, but keeping rigid all chords perpendicular to the axis. 

FIGURE IV 
PROBABLE ERRORS OF INTERPOLATION AND EXTRAPOLATION 

The possible populations may be compared to a large number of 
wandering particles which start together at time $t_1$ and then migrate up 
and down independently by numerous short leaps. Meanwhile the 
vertical line carrying them is moving with uniform velocity to the right 
and is also slipping upward so that the center of the group of particles 
follows a logistic. Then at any time just half of the particles are within 
the limits indicated by the deformed parabola. 

The probability that at time t the logarithm of the population will lie 
between log p and log p + d(log p) is 

$$\frac{1}{k\sqrt{2\pi(t - t_1)}}\; e^{-\frac{(\log p - \log \bar p)^2}{2k^2(t - t_1)}}\; d\log p,$$

where $\bar p$ is given by the logistic. 

For interpolation the probable errors are smaller. Not all the wan- 
dering particles are to be counted now, but only those which pass 
through the correct point at the final as well as the initial end of the 
interval. 

Let $\bar p$ be the most probable size of the population at time t. The 
probability that a population of given size at time $t_1$ will be of size p at 
the subsequent time t is given above. The same expression gives the 
probability that a population known to be of size p at time t was of the 
given size at time $t_1$. If $t - t_1$ is replaced by $t_2 - t$ it gives the probability 
that this population will at time $t_2$ be of the size given by the census of 
the latter date. The probability that a population p at time t took the 
census values at both dates is the product of the two, namely 

$$\frac{1}{2\pi k^2\sqrt{(t - t_1)(t_2 - t)}}\; e^{-\frac{(t_2 - t_1)(\log p - \log \bar p)^2}{2k^2(t - t_1)(t_2 - t)}}\; (d\log p)^2.$$

By Bayes' theorem, supposing all values of log p equally likely a 
priori, the inverse probability that the population counted at times 
$t_1$ and $t_2$ was at time t of size p is the last expression times a factor 
independent of p. Our conclusion is, therefore, that log p is normally 
distributed with variance 

$$k^2\,\frac{(t - t_1)(t_2 - t)}{t_2 - t_1},$$

and that its probable error is therefore 

$$.6745\,k\sqrt{\frac{(t - t_1)(t_2 - t)}{t_2 - t_1}}.$$

The graph of this function is an ellipse. Thus Figure IV gives a 
representation of the degree of probability to be attached to popula- 
tion estimates for a hypothetical case. For the United States the 
probable errors, which are given below, are so small that it was found 
impossible to make a satisfactory figure. The most probable size of 
the population is shown by the heavy line, and it is equally likely that 
the true population at any time should be represented by a point 
inside or outside of the system of distorted ellipses and parabola. 

For the United States we find, from the value of k obtained in §16, 
the following probable errors of log10 p for interpolation in a ten-year 
interval and for extrapolation. At the right are the factors by which 
p is multiplied or divided when the probable error is added to or 
subtracted from log p. These results are to be used with the population 
estimates of Table IV and §17. It is noteworthy that the interpolated 
values of the population theoretically determined in §16 have, even 
in the middle of the interval, a probable error less than three-fourths 
of one per cent, and that prediction a century in advance has a 
probable error less than five per cent. 

                                         Probable error     Corresponding
                                         of log p           factor of p

Interpolation: 
  1 and 9 years from censuses               .00190             1.0044
  2 and 8 years from censuses               .00254
  3 and 7 years from censuses               .00291
  4 and 6 years from censuses               .00311
  5 and 5 years from censuses               .00317
Extrapolation: 
  …                                         .00634
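
The probable errors follow directly from the two formulas of this section. The sketch below assumes k = .002975 as found from Table III, logarithms to the base ten, time in years, and a ten-year interval between censuses; the function names are ours.

```python
import math

K_SPECIFIC = 0.002975  # k from Table III (log10 units, time in years)
PE_FACTOR = 0.6745     # probable error = .6745 times the standard deviation

def pe_extrapolation(years_since_census, k=K_SPECIFIC):
    """Probable error of log10 p, a given number of years after the latest census."""
    return PE_FACTOR * k * math.sqrt(years_since_census)

def pe_interpolation(years_since_last, interval=10.0, k=K_SPECIFIC):
    """Probable error of log10 p between two censuses `interval` years apart."""
    remaining = interval - years_since_last
    return PE_FACTOR * k * math.sqrt(years_since_last * remaining / interval)

for y in range(1, 6):
    pe = pe_interpolation(y)
    print(f"{y} and {10 - y} years from censuses: {pe:.5f}  factor {10 ** pe:.4f}")
for y in (10, 100):
    pe = pe_extrapolation(y)
    print(f"{y} years after the census: {pe:.5f}  factor {10 ** pe:.4f}")
```

The factor for the middle of the interval is below 1.0075 and that for a century ahead below 1.05, in accord with the statements made above.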
20. Further Problems. We have assumed accurate censuses. An 
improvement on this assumption could be made by historical studies 
to fix the probable error to be assigned to each census. The United 
States census of 1870, for example, is known to have been very 
inaccurate, and its lack of accord with the other censuses may be seen from 
Table III. 

On account of the assumption made in §14 that fluctuations of the 
growth rate about the value given by the logistic are independent in 
successive elements of time, our interpolating function, though con- 
tinuous, fails to have a continuous derivative at the times of the cen- 
suses. It may be suggested that even more accurate results could be 
obtained from a curve having some “stiffness,” so as to yield continuous 
derivatives. By allowing the curve to pass slightly above or 
below each census point one might establish a criterion of goodness 
based on the probable errors of the censuses and the curvature some- 
what like that advanced by E. T. Whittaker in the paper cited in §7. 
This would involve an interesting combination of historical with 
mathematical research. 

A further improvement in population estimates can be made by the 
use of the incomplete data in existence concerning births, deaths, 
immigration and emigration. There is also the problem, prominent in 
the estimation of populations of cities, of the use of indices such as 
school attendance and number of telephones. The problem of op- 
timum estimates by means of such data may be attacked through an 
elaboration of the theory of random changes outlined above. 

In these and other problems pertaining to differential equations 
subject to perturbations a geometrical analogue will be helpful. In a 
space of a number of dimensions equal to the number of variables 
considered, each point represents a possible statistical situation. The 
space need not be Euclidean; its curvature and the measure of distance 
in it will depend upon the intercorrelations and standard deviations 
of the variables. Interpolation may be studied by means of geodesics. 
The generalization of the theory of probable error outlined in §19 is 
related to the mathematical theory of heat conduction in this curved 
space. 


CONSTRUCTION OF AN INDEX NUMBER OF 
PRODUCTION¹ 


By Woodlief Thomas, Division of Research and Statistics, Federal Reserve Board 


Within recent years several attempts have been made to measure 
changes in the physical volume of production by means of index 
numbers.² The various indexes constructed have served as useful 
tools in the analysis and interpretation of changes in business condi- 
tions. These indexes, however, possess certain admitted limitations, 
largely because of the scarcity or unsatisfactory character of available 
data suited for their construction or because of the statistical methods 
employed. Some of these limitations were recognized from the 
beginning and the application over a period of years of the indexes as 
compiled to definite problems of analysis and interpretation has 
revealed others. 

On the basis of experience in compiling and using indexes of produc- 
tion, it has been possible to evolve methods of construction which will 
in part remove some of these limitations. The application of these 
improved methods to the more comprehensive statistical information 
that has become currently available makes possible the construction of 
a more satisfactory index of production. Accordingly the Federal Re- 
serve Board has recently constructed and published a new index of in- 
dustrial production, which supersedes the production indexes previously 
published by the Board. It is the purpose of this article to show how 
the data used in this new index and the methods of construction em- 
ployed illustrate the problems involved in the construction of produc- 
tion indexes, with special emphasis upon improvements in data and 
technique as contrasted with previously constructed indexes.³ 

Former Indexes and Their Limitations. Three indexes of production 
were previously compiled by the Federal Reserve Board—an index of 
production in basic industries, an index of manufacture, and one of 

1 Adapted from a paper read at the eighty-eighth annual meeting of the American Statistical Associa- 
tion at St. Louis, December 28, 1926. 

2 Annual indexes were computed in 1920 by Walter W. Stewart and Edmund E. Day; see Journal of 
American Economic Association, March, 1921, and Review of Economic Statistics, Preliminary Volume, 
September, 1920, to January, 1921. Carl Snyder and Willford I. King have also compiled annual in- 
dexes. Monthly indexes were published late in 1921 by the Harvard Committee on Economic Research, 
early in 1922 by the Federal Reserve Board, and in 1923 by the Survey of Current Business of the Depart- 
ment of Commerce, and by the Standard Statistics Company. In addition several so-called indexes of 
general business activity using similar data and methods have been compiled. 

3 The new index and a description of data and methods employed in its construction were presented in 
the issues of the Federal Reserve Bulletin for February and March, 1927, and figures for current months 
are given regularly in that Bulletin. 
mining. The index of production in basic industries was composed of 
22 series of manufactures and minerals for which, when the index was 
compiled in 1922, sufficient data were available to permit adjustments 
for seasonal variations. The other indexes were unadjusted. All were 
expressed in terms of 1919 as 100. 

By 1926 it was possible to construct a more comprehensive adjusted 
index of industrial production because many new, more adequate, and 
more promptly reported series had become available. Data for the 
years since 1922 permitted adjustment for seasonal variations for all the 
series included, which had not formerly been possible. In order to give 
a better picture of current changes in production, it was desirable to 
adopt a more recent base period and revised weights which would reflect 
such important changes as the growing proportionate importance of the 
automobile industry. A revision of the formula used was also desira- 
ble, in view of the recent important contributions to methods of con- 
structing index numbers. Experience in the use of production indexes 
had made especially clear the importance of an allowance for monthly 
variations in the number of working days. A distinctive feature of the 
new index is that it is based upon daily average production for each 
month. 

CHART I 
General Characteristics of the New Index. The new index of industrial 
production, which was constructed to incorporate these improvements, 
is, accordingly, more comprehensive and more representative of current 
industrial changes than the former indexes (see Chart I). It is made 
up of two component indexes, one of manufactures and the other of 
minerals, and is computed from 60 series of monthly figures representing 
average output per working day and adjusted for typical seasonal 
variations. The average of the three years, 1923, 1924, and 1925, 
was adopted as the base. The weights for the various industries in 
the manufactures index were derived from figures showing value added 
by manufacture, as reported by the Census of Manufactures for 1923 
and 1919, and those for the minerals index were derived from values 
produced, as reported by the Geological Survey and the Bureau of 
Mines for the three base-years, 1923-25, and for 1919. An aggregative 
formula was employed to combine the individual series into a composite 
index. From 1919 to 1922 the final index was an average of two sepa- 
rately computed indexes, one with 1919 weights and the other with 
weights for 1923 (1923-25 in the case of minerals). For 1923 and sub- 
sequent years only the later set of weights was used. 

Scope of the Index and Selection of Data. Statistics indicating current 
changes in volume of production, that can be promptly and reliably 
obtained, are limited to a relatively small number of the thousands of 
products of industry. Those available figures, however, which are 
suitable for use in a production index, are in themselves so important, 
or are used as a basis of further production in such a large number of 
industries, that they either directly measure or indirectly indicate the 
production of a large proportion of industry. For example, fluctuations 
in the production of steel ingots fairly adequately measure concurrent 
movements in the more advanced stages of steel manufacture and less 
closely represent the broader swings of manufacturing activity in 
industries making finished products from steel. In some industries, 
the textiles, for example, no statistics of products are available, but 
figures showing the current consumption of certain basic raw materials— 
silk, wool, and cotton—are used. Changes in the volume and character 
of the finished products of these industries, owing to the introduction 
of a new material, are not reflected in the index. An example is the 
increasing use of rayon as a textile material. 

For several industries there are no promptly reported reliable 
monthly series of statistics which either directly or indirectly represent 
the volume of production. Some of the more important of these are 
canning and preserving, butter, cheese, and condensed and evaporated 
milk,¹ chemicals and drugs, railroad repair shops, and musical instruments. 
In a number of others the representation is exceedingly indirect, 
e.g., machinery, hardware, and plumbers’ supplies by iron and 
steel, clothing by textile materials, printing and publishing by paper, 
and furniture by lumber. 

Another characteristic of data available for production indexes is 


1 Monthly data for this industry are compiled by the Department of Agriculture but are reported too 
late for inclusion in an index compiled for current use. 
that for the most part they cover relatively simple articles whose 
quantity can be accurately measured in terms of uniform physical 
units. A piece of machinery of one year, for example, is frequently 
different from that called by the same name a few years later and there- 
fore no adequate production statistics for machines are available. 
There are, however, some exceptions, such as automobiles and loco- 
motives, and these products are represented simply by total number 
produced, which gives no indication of changes in degree of elaboration. 
A million 1927-model automobiles are assumed to be the same as a 
million 1917 models. This has the effect of understating the long- 
time growth of manufactures, as the proportion of elaborately con- 
structed articles is constantly increasing. 

On the basis of the foregoing considerations, practically all reliable 
series measuring monthly production, available promptly, compiled 
since the beginning of 1923 at the latest, and representing industries not 
more adequately covered by other series, are included in the Federal Re- 
serve Board’s new index. The index is adjusted for seasonal variations 
and, therefore, most of the series were selected to cover a sufficiently 
long period of time to provide measurement of typical seasonal move- 
ments. Since the scope of the index is prescribed by the requirements 
of currentness and of measurement in physical units, it includes only 
manufactures and minerals—construction and agriculture are excluded, 
except indirectly through industries producing building materials or 
using farm products as raw materials. The manufactures index is 
composed of 52 series covering industries with a value added by manu- 
facture in 1923 of $10,000,000,000, and these industries in turn in- 
directly represent major industrial groups with a value added of 
$20,000,000,000—80 per cent of the total for all manufacturing in- 
dustries. The average value of the production for the three years 
1923-25 of the eight mineral products included in the index totaled 
$3,360,000,000—77 per cent of the total for all minerals. A list of 
industries and individual series and the relative importance of each 
are given in the tables at the end of this article. The two component 
indexes are shown in Chart II. 

Daily Average Output. A distinctive characteristic of the new index 
is that it is based upon figures representing average output per working 
day instead of total production during the calendar month, and is thus 
not influenced by changes in the number of Sundays and holidays from 
one month to another. The adjustment of figures representing total 
monthly output removes the effect of a variable and measurable 
influence extraneous to the operation of business forces and provides a 
series in which the monthly figures are directly comparable with each 
other. The number of working days in each industry, used to make 
these adjustments, was arrived at through a special inquiry conducted 
jointly by a committee composed of representatives from several 
government bureaus. 

The two uppermost curves on Chart III illustrate the effect of using 
daily average figures as contrasted with monthly totals. The top 
curve is an unadjusted index of manufacturing production, which was 
compiled from statistics of total monthly output. The middle curve 
shows the new index computed from daily average data without any 
adjustments for seasonal variations. The first curve shows the in- 
fluence of the varying number of working days in the different months 
by the more erratic nature of its month-to-month fluctuations as com- 
pared with the second curve, the differences being especially evident 
whenever a long month comes between two short ones, as in the case, 
for example, of February, March, and April. 
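The effect of the working-day adjustment can be illustrated with a minimal sketch. The output totals and working-day counts below are hypothetical figures chosen for illustration, not data from the index.

```python
# Hypothetical monthly totals and working-day counts (not figures from the index).
monthly_output = {"February": 2400.0, "March": 2700.0, "April": 2600.0}
working_days = {"February": 24, "March": 27, "April": 26}

# Average output per working day removes the effect of the calendar.
daily_average = {
    month: monthly_output[month] / working_days[month]
    for month in monthly_output
}

for month in monthly_output:
    print(month, monthly_output[month], daily_average[month])
```

In this constructed case the monthly totals rise and fall with the length of the month, while the daily averages are identical, which is the kind of extraneous movement the adjustment is meant to remove.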

Adjustments for Seasonal Variations. The daily-average figures 
were adjusted for recurrent seasonal movements before they were 
combined into the composite index. The methods used to make these 
adjustments are too complicated to permit a full description of them 
in this article. In brief, percentages of actual figures to twelve-month 
moving averages, centered at the seventh month, were computed, 
and from the array of these percentages for each month a typical item 
was selected and a monthly adjustment factor derived. The lowest 
curve on Chart III shows the index after adjustment for seasonal 
changes, and comparison of curves 2 and 3 gives a general idea of the 
seasonal variations which characterize manufacturing as a whole. 
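The brief description above may be sketched as follows. Two points are assumptions and not statements of the Board's procedure: the "typical item" is taken to be the median of each month's array, and the twelve factors are scaled to average 100. All function names are hypothetical.

```python
from statistics import median

def moving_average_at_seventh_month(series):
    """Twelve-month averages, each placed at the seventh month of its window."""
    averages = [None] * len(series)
    for start in range(len(series) - 11):
        averages[start + 6] = sum(series[start:start + 12]) / 12.0
    return averages

def seasonal_factors(series):
    """Monthly adjustment factors; series[0] is taken to be a January figure."""
    averages = moving_average_at_seventh_month(series)
    by_month = [[] for _ in range(12)]
    for i, average in enumerate(averages):
        if average is not None:
            by_month[i % 12].append(100.0 * series[i] / average)
    typical = [median(values) for values in by_month]  # assumed "typical item"
    scale = 1200.0 / sum(typical)  # assumed scaling: factors average 100
    return [t * scale for t in typical]

def seasonally_adjust(series, factors):
    """Divide each daily-average figure by the factor for its month."""
    return [100.0 * x / factors[i % 12] for i, x in enumerate(series)]

# Hypothetical series: a constant level of 50 with a fixed seasonal pattern.
pattern = [92, 96, 104, 102, 100, 98, 94, 97, 103, 108, 105, 101]
series = [50.0 * pattern[i % 12] / 100.0 for i in range(48)]
factors = seasonal_factors(series)
adjusted = seasonally_adjust(series, factors)
```

For this artificial series the factors recover the pattern and the adjusted figures are constant. The sketch uses one fixed set of factors and does not cover factors that change over a period of years.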

CHART III (horizontal axis: years 1919 to 1926) 

The index for recent years has an advantage over previously compiled 
indexes because the current period has been considered in computing 
seasonal adjustment factors. Study of seasonal fluctuations 
during recent years indicates that in a number of industries there have 
been important changes in the character of such variations. These 
shifts are constantly taking place, sometimes suddenly, sometimes 
gradually over a period of years. Accordingly it was necessary in such 
industries to compute monthly adjustment factors which changed over 
a period of years. Some outstanding examples of shifting seasonal 
movements occurred in the production of automobiles, in cotton con- 
sumption, in meat packing, and in flour milling. 

Choice of 1923-25 Base. The average for the years 1923, 1924, and 
1925 was selected as the period to which the new index and all of its 
components are referred as the common basis of comparison. This 
period fulfills the requirements of recentness and familiarity. Further- 
more, it was relatively free from extreme variations, although in any one 
of these three years there were conditions in individual industries which 
would have vitiated its use singly as a basis of comparison between 
industries. Sharp fluctuations occurred in coal production, and there 
were significant contrasts of considerable magnitude between the 
recession which took place in the wool and leather industries and the 
expansion in the petroleum refining and rubber industries. The use 
of the three years as a base not only gives a more typical comparison 
between the indexes for these industries than would be provided by the 
use of any single year, but also diminishes the dispersion between the 
individual series during the current period. 

This lessening of dispersion has a technical advantage in that it 
diminishes the undue influence of extreme items on the composite 
index. The effect of wide dispersion is more pronounced where there 
are relatively few items, as in the case of production indexes as con- 
trasted with price indexes, and is particularly important if the index 
is an average of relatives.¹ 

Basis and Representativeness of Weights. Next to the comprehensive- 
ness of the basic quantity statistics included, the representativeness 
of the weights (or measures of relative importance), applied to the 
several series before combining them, is the principal element deter- 
mining the accuracy of a production index. In the Federal Reserve 
Board’s index, as in most indexes, value of output is used as the basis 
of weighting. This figure is a resultant of all the various factors of 
production, such as labor expended and capital equipment applied, 
and at a given time such figures for a number of industries probably 
represent the importance of these industries relative to one another 
better than any other available indicator. In the case of manufac- 
tures, however, the total value figures reported by the Census of 
Manufactures are made up to a large degree of the cost of raw materials, 
which are products of farms, mines, and forests, and of other stages of 
manufacture, and do not represent the actual addition to the flow of 
goods contributed by the manufacturing process itself. For this reason, 
in indexes of manufactures, weights are based upon value added by the 
process of manufacture, which is total value less cost of raw materials. 
The weights for minerals were compiled from total value of output as 
reported by the Geological Survey and the Bureau of Mines. 

It was stated above that statistics suitable for inclusion in an index 
are available for only a small number of the products of industry, and 
that an industry or a group of industries is often represented in an 
index by the output of one or a few important products or by the 
consumption of a basic raw material. On this basis each series is 
weighted in accordance with the relative importance of all the indus- 
tries that it represents in the index. The procedure in arriving at 
measures of relative importance of the various series included in the 
index of manufactures was as follows: The series were arranged by 
industries and by major industrial groups such as iron and steel, textiles, 


1 In indexes in which the base is a computed trend curve, there is need for frequent revision, owing to 
the failure of current data to follow the curves computed on the basis of figures for earlier years and pro- 
jected into the present. Data covering years of pre-war, war, and early post-war periods, during which 
far-reaching changes took place in industrial arrangements have not generally provided satisfactory 
indicators of the trend of production for the present and the immediate future. It needs yet to be 
proved, in fact, that industrial changes conform even roughly to mathematical formulae. 













and food products. The value added by manufacture for each group 
was distributed among the industries by which it is represented in the 
index in proportion to the respective values added by manufacture 
for these industries. These derived industry figures were in turn dis- 
tributed among the individual series in proportion to value of product. 
(Value-added figures are not available for smaller subdivisions than 
industries.) This procedure provides a measure of relative importance 
of each series in the index. The derivation of the actual weight factors 
applied to the quantity figures in constructing the index is a problem 
relating to type of formula used and is described below. 
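The two-stage distribution just described can be sketched as follows. All industry names and dollar figures are hypothetical, and the helper `distribute` is introduced only for illustration.

```python
def distribute(total, shares):
    """Split `total` among the keys of `shares` in proportion to their values."""
    share_sum = sum(shares.values())
    return {key: total * value / share_sum for key, value in shares.items()}

# Hypothetical figures: value added for a whole major group, including
# industries that have no series of their own in the index.
group_value_added = 1000.0

# Value added by the industries that are represented in the index.
industry_value_added = {"industry_a": 300.0, "industry_b": 100.0}

# Value of product for the individual series within each industry.
series_value_of_product = {
    "industry_a": {"series_1": 600.0, "series_2": 200.0},
    "industry_b": {"series_3": 50.0},
}

# Stage 1: group total distributed among industries by value added.
industry_weights = distribute(group_value_added, industry_value_added)

# Stage 2: each industry figure distributed among its series by value of product.
series_weights = {}
for industry, weight in industry_weights.items():
    series_weights.update(distribute(weight, series_value_of_product[industry]))
```

The series weights sum to the group total, so each series carries the importance of all the industries it represents.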

Although value figures provide the chief basis of weighting, it was 
necessary in individual cases to make adjustments for such factors as 
reliability of the statistics, their limited comprehensiveness, and the 
degree to which they are typical of industries in the same group for 
which data are not available. A case in point is the large weight 
generally given to pig-iron production, which has had an appreciable 
effect upon most existing index numbers of production. This series, 
because of its basic nature, was at one time considered the most 
representative of all production statistics. Within recent years, however, 
owing to the increasing use of scrap iron and steel in the production 
of steel, the pig-iron figures have become less representative of the iron 
and steel industry, and therefore of industrial production in general. 
The production of steel ingots, for which figures have been available 
since 1917, furnishes a more satisfactory measure, and in the new index 
was given a much larger weight than pig iron. 

Another problem of weighting is presented by the changes which take 
place over a period of time in the relative importance of various indus- 
tries and products as compared with one another. Unless allowance is 
made for these shifts, the index becomes progressively less dependable 
as an indicator of current movements. In most monthly indexes of 
manufactures in use, the weights remain constant throughout the 
period covered and at present are generally based upon data given in 
the 1919 census of manufactures. Changes in the importance of the 
shipbuilding and the automobile industries since that date afford two 
outstanding examples, and a corresponding revision of the weights 
given these two industries makes a certain amount of difference in the 
results obtained. Revision of the weights to a more recent period, 
however, does not completely solve the problem: recent weights are 
better for current months, but if the index extends over a long period 
of years current weights are no more typical of past years than past- 
year weights are of the present. For this reason, in compiling the 
index of industrial production two separate sets of weights were 
employed. One set was based on 1919 figures and the other on 1923 
data for manufactures and on 1923-25 data for minerals. With these 
weights, two different sets of index numbers were compiled for the 
period from 1919 through 1922, and the average of these two was taken 
as the final index for that period. This procedure is a problem of 
formula and is further discussed below. 

Selection of a Formula. After selecting the basic data, making the 
necessary adjustments and determining the base period, the next step 
in the process of constructing the index was to combine the figures for 
individual industries into a composite expression of the flow of products 
from industry as a whole. The basic data are heterogeneous in a great 
many ways that make their combination difficult. They are expressed 
in a variety of non-comparable units—tons, bushels, barrels, gallons, 
bales, and others; some of them represent industries far more important 
than others; some cover a larger proportion of the industries which 
they represent than do others; some are actual products, some raw 
materials consumed, some are products of one stage of production and 
materials for another, while others measure only the relative activity 
of machinery engaged in production. 

In combining these figures it was necessary to express them in terms 
of a common unit and to make sure that each series occupied a place in 
the composite proportional to the relative importance of the industry 
represented. The method of making this combination may be 
expressed in terms of a mathematical formula. Such formulae are 
almost innumerable, expressing different methods and combinations 
of methods, but it is possible to divide them, on the basis of type of 
average used, into two main groups: (1) averages of relatives and (2) 
ratios of aggregates. In the more familiar average-of-relatives method, 
the reduction of each series of production data to relatives in terms of a 
common base period makes them homogeneous for the purpose in view, 
regardless of the many differences in the original data in respect to 
units and degree of representativeness. In conception this method is 
simple and in practice it is not unduly involved. 

In the aggregative method, each series of quantity figures is multi- 
plied by a conversion factor which serves two purposes: (1) it reduces 
the quantity figures to values, putting all the series on a comparable 
basis, and (2) it weights them in accordance with the relative impor- 
tance of the industry represented. The values thus computed are 
added for each date separately and their aggregates are converted into 

1 For technical reasons, it would have been desirable to have base-period weights for the manufactures 
index, also, but the censuses of manufactures are taken only in alternate years and cover but a single 
year. At the time of the computation of the index, the 1923 census was the latest available. 
percentages of the base period aggregate. These percentages are the 
index numbers. 
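A minimal sketch of the aggregative computation, under the assumption that the base-period aggregate is the simple mean of the aggregates for the base dates. The series names, quantities and conversion factors are hypothetical.

```python
def aggregative_index(quantities, conversion_factors, base_positions):
    """Ratio-of-aggregates index, as percentages of the base-period aggregate."""
    periods = len(next(iter(quantities.values())))
    aggregates = [
        sum(conversion_factors[s] * quantities[s][t] for s in quantities)
        for t in range(periods)
    ]
    base = sum(aggregates[t] for t in base_positions) / len(base_positions)
    return [100.0 * a / base for a in aggregates]

# Hypothetical quantities in non-comparable units for three dates.
quantities = {
    "steel_tons": [10.0, 12.0, 11.0],
    "cotton_bales": [200.0, 180.0, 220.0],
}
# Hypothetical conversion factors (value per physical unit).
conversion_factors = {"steel_tons": 5.0, "cotton_bales": 0.25}

# All three dates together form the base period in this example.
index = aggregative_index(quantities, conversion_factors, [0, 1, 2])
```

The conversion factors put tons and bales on a common value basis before they are added, so no series needs to be reduced to relatives first.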

The difficulty of obtaining the conversion factors needed in the aggre- 
gative method has made the method an unpopular one for production in- 
dexes. They were computed for this index, however, by a relatively 
simple procedure. Derivation of total weights, or measures of relative 
importance, from value added figures and their assignment to individual 
series have already been explained. In deriving the weight factors 
these total weights for each series were divided by the quantity figures 
for the period to which they apply. This process involves only one 
step in addition to that of deriving the total weights which would be 
the weights used if the index were an average of relatives.¹ For instance, 
the total weight assigned to the cotton goods industry in 1923 
was divided by the number of bales of cotton consumed by the industry 
in that year in order to obtain the 1923 weight factor for cotton 
consumption.² The factors thus computed resemble average prices, but 
in reality are abstractions. The so-called quantity figures, while 
accurately reflecting changes in output, do not represent total quantity 
of manufactured product. On the same basis, the value figures do 
not represent total value of output. Some of the quantity figures 
measure production of a single important product, some consumption 
of a basic raw material, and others activity of machinery. 
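The derivation of weight factors, and the point made in the footnote that the aggregative form agrees with the arithmetic average of relatives when the weights apply to the base period, can be checked with a small sketch. All figures are hypothetical.

```python
# Hypothetical total weights (measures of relative importance) and quantities.
total_weight = {"cotton_bales": 40.0, "steel_tons": 60.0}
base_quantity = {"cotton_bales": 500.0, "steel_tons": 30.0}
given_quantity = {"cotton_bales": 550.0, "steel_tons": 27.0}

# Weight factor: total weight divided by the quantity for the same period.
weight_factor = {s: total_weight[s] / base_quantity[s] for s in total_weight}

# Aggregative form: ratio of aggregates computed with the weight factors.
aggregative = 100.0 * (
    sum(weight_factor[s] * given_quantity[s] for s in weight_factor)
    / sum(weight_factor[s] * base_quantity[s] for s in weight_factor)
)

# Weighted arithmetic average of relatives with the total weights.
average_of_relatives = 100.0 * (
    sum(total_weight[s] * given_quantity[s] / base_quantity[s] for s in total_weight)
    / sum(total_weight.values())
)
```

Both expressions give 98 here, because the weights and the base quantities refer to the same period. When they refer to different periods the two forms need not agree.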

The aggregative method was found, on the basis of both a priori and 
empirical considerations, to be peculiarly adapted to the particular 
characteristics of the Federal Reserve Board’s new index number. 
There are certain general advantages of this form: (1) it is free from 
statistical bias inherent in many forms involving averages of relatives, 
(2) its base can be shifted freely without recalculation, and (3) it is 
simple to compute. The first two of these advantages were of con- 
siderable importance in this case, owing to the fact that the base 
period (1923-25) and the period for which weights are available for 
manufactures (1923) were not coincident. The aggregates used in 
computing the index by this method are first compiled without regard 
to base, and any one of them or the average of any set can be used as a 
base without affecting in any way the existing relationships. 
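The freedom to shift the base can be illustrated with hypothetical aggregates: whatever date or set of dates is taken as the base, the ratio between any two index numbers is unchanged.

```python
def rebase(aggregates, base_positions):
    """Express aggregates as percentages of the mean aggregate of the base dates."""
    base = sum(aggregates[t] for t in base_positions) / len(base_positions)
    return [100.0 * a / base for a in aggregates]

# Hypothetical aggregates for four dates, compiled without regard to base.
aggregates = [100.0, 105.0, 110.0, 121.0]

index_single_base = rebase(aggregates, [0])    # first date as base
index_average_base = rebase(aggregates, [1, 2])  # average of two dates as base

# The relation between the last and first dates under each base.
ratio_single = index_single_base[3] / index_single_base[0]
ratio_average = index_average_base[3] / index_average_base[0]
```

Both ratios equal 1.21, since rebasing only divides every aggregate by the same constant.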

In contrast, the results obtained by an average-of-relatives method 
vary with the base period used, and comparisons between index numbers 
for different periods are, therefore, somewhat affected by the 
choice of base period. In this method, also, the relation between the 


1 If the weights applied to the entire base-period, 1923-25, the aggregative form and the arithmetic 
average of relatives would give exactly the same results. 
2 An illustration of the process of deriving weights was given in the Federal Reserve Bulletin for March, 
1927, p. 174. 

period to which the weights apply and the base period may affect the 
results appreciably. An arithmetic average of relatives with weight 
period and base period not coincident, is subject to an upward bias, and 
therefore this method should not be used with 1923 weights and a 
1923-25 base. Any weighted geometric average is likely to be biased. 
The degree of bias tends to vary according to certain characteristics of 
the index—primarily the number of series and range of variation of the 
figures from the base, i.e., their dispersion. At times these considerations 
are not of appreciable importance, yet at other times they make a 
difference too large to be ignored.¹ Since in the Federal Reserve 
Board’s index of manufactures, the number of items was small, the dis- 
persion was large, and weights were not available for the chosen base 
period, it seemed advisable to use an aggregative formula. 
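How dispersion separates the two kinds of average of relatives can be seen in a small sketch with hypothetical relatives and equal weights. It illustrates only the arithmetic of the two forms, not the bias measurements referred to above.

```python
from math import exp, log

def weighted_arithmetic(relatives, weights):
    """Weighted arithmetic average of relatives."""
    total = sum(weights.values())
    return sum(weights[s] * relatives[s] for s in relatives) / total

def weighted_geometric(relatives, weights):
    """Weighted geometric average of relatives (computed through logarithms)."""
    total = sum(weights.values())
    return exp(sum(weights[s] * log(relatives[s]) for s in relatives) / total)

# Two hypothetical relatives widely dispersed about the base of 100.
relatives = {"series_a": 80.0, "series_b": 125.0}
weights = {"series_a": 1.0, "series_b": 1.0}

arithmetic = weighted_arithmetic(relatives, weights)
geometric = weighted_geometric(relatives, weights)
```

The arithmetic average is 102.5 while the geometric average is very nearly 100. With the same weights the arithmetic average of positive relatives is never below the geometric, and the gap widens as the relatives become more dispersed.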

The computation of two index numbers with different sets of 
weights—one for an earlier period and another for a later period—and 
the averaging of these two indexes in the earlier years to obtain the final 
index was explained above under the discussion of weights. This gave 
the formula for these earlier years a double aspect. The formula thus 
adopted for use may be expressed symbolically as follows: 





$$I_{oz} = \sqrt{\frac{\sum p_{23}\, q_z}{\sum p_{23}\, q_o} \times \frac{\sum p_{19}\, q_z}{\sum p_{19}\, q_o}}$$

$I_{oz}$ represents the index for a given period, $q_z$ and $q_o$ the physical 
volume of production of a product for the given period and the base 
period (1923-25) respectively, $p_{23}$ and $p_{19}$ the corresponding weight 
factors of that product, and the $\sum p\,q$'s signify aggregate values of all of 
the products included in the index, individual sets of values being 
obtained from different combinations of the weights and quantities. 
For 1923 and later years the formula used includes only the first term 
under the radical.² The same procedure was followed in computing 
indexes for the major industrial groups where the differences were 
significant. 
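The combination of the two sets of weights can be sketched under an assumed reading of the formula: the square root of the product of two ratios of aggregates, one computed with 1923 weight factors and one with 1919 weight factors. The quantities and weight factors are hypothetical, and the unequal weighting used for the 1922 months is not covered.

```python
from math import sqrt

def aggregate(weight_factors, quantities):
    """Sum of weight factor times quantity over all series."""
    return sum(weight_factors[s] * quantities[s] for s in weight_factors)

def index_with_two_weight_sets(q_given, q_base, p_23, p_19):
    """Geometric average of two aggregative ratios (assumed form of the formula)."""
    ratio_23 = aggregate(p_23, q_given) / aggregate(p_23, q_base)
    ratio_19 = aggregate(p_19, q_given) / aggregate(p_19, q_base)
    return 100.0 * sqrt(ratio_23 * ratio_19)

# Hypothetical quantities and weight factors for two series.
q_base = {"series_a": 10.0, "series_b": 20.0}
q_given = {"series_a": 8.0, "series_b": 25.0}
p_23 = {"series_a": 2.0, "series_b": 1.0}
p_19 = {"series_a": 3.0, "series_b": 1.0}

index_value = index_with_two_weight_sets(q_given, q_base, p_23, p_19)
```

With these figures the 1923-weight ratio is 1.025 and the 1919-weight ratio is 0.98, and the combined index lies between the two, slightly above 100.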

This formula is similar in some respects to the so-called “Ideal” 
form recommended by Professor Irving Fisher in his Making of Index 

1 For a full treatment of the characteristics of index number formulae, see Irving Fisher, The Making 
of Index Numbers. More concise discussions may be found in Index Numbers of Wholesale Prices in the 
United States and Foreign Countries, by Wesley C. Mitchell, United States Bureau of Labor Statistics 
Bulletin No. 284, October, 1921, or in some text book on statistics, such as Statistical Analysis by 
Edmund E. Day, Business Statistics by Frederick C. Mills, or the Handbook of Mathematical Statistics 
by Rietz, Young and others. 


2 In obtaining these averages for the 1922 months, the indexes with 1923 weights were given a weight 
of two and those with 1919 weights a weight of one, 1923 being considered more typical of 1922 than 1919. 
Numbers.¹ The “Ideal” form, however, in its strict application demands 
a different set of weight factors for each period covered by the 
index. This is undesirable because of the labor involved, impracticable 
because the data for so many sets of weights are not available, and 
unnecessary because only over long periods of time are changes in 
weights large enough to be significant. 

Tests with Other Types of Formulae. In order to test the results 
obtained by this index, annual indexes were computed by several other 
formulae for the year 1919. That year was selected because it was the 
most distant from the base period, because of the wide dispersion of its 
individual series, and because the 1919 census provided data for weights. 
Nine indexes were computed—three aggregative indexes, three arith- 
metic averages of relatives, and three geometric averages of relatives. 
The first of each set was computed with 1923 weights, the second with 
1919 weights, and the third was an average of the other two. Chart 
IV illustrates the results obtained. The census index shown has been 
computed from the quantity data reported in the biennial censuses of 
manufactures and is undoubtedly the most comprehensive available 
measure of changes in the volume of manufacture.2 Comparison 
between the Federal Reserve Board’s index with 1923 weights and 
this Census index showed increases between 1919 and 1923 of 18.5 
per cent in the former index and 22 per cent in the latter. The increase 
in the Federal Reserve Board’s index with 1919 weights in the same 
period was 21 per cent, and that in the Board’s average index, the one 
adopted, was 20 per cent. The discrepancy between the increase 
shown by the last-mentioned index and that of the Census index is 
slight and can be attributed to important differences in data.3

1 This formula is expressed symbolically as follows:

$$I = \sqrt{\frac{\sum p_0 q_1}{\sum p_0 q_0} \times \frac{\sum p_1 q_1}{\sum p_1 q_0}}$$

2 This index will be published in a Census monograph The Growth of Manufactures prepared by Edmund
E. Day and Woodlief Thomas. It revises and brings up to date a similar index compiled in 1920 by
the Harvard Committee on Economic Research. The index is a geometric average of relatives, with
weights computed by averaging the proportionate importance of the individual series in the base year
(1919) and in the particular year to which the index applies. For the purpose of the comparison shown
on the accompanying chart, it has been recomputed on the basis of the 1923 index as 101, the figure for
the Federal Reserve Board’s index for that year.

3 The Census index, having a larger number of the more elaborate products of industry, increases
somewhat more rapidly, as is further indicated by the increases shown between 1923 and 1925: 5.4 per
cent for the Census index and 4.0 per cent for the Board’s index.

[CHART IV. Annual index numbers of manufacture as computed by different formulae; vertical scale in per cent.]

Significant variations between results obtained by different
index-number formulae are shown on the chart. The three arithmetic
averages of relatives (4, 5, and 6 on the chart) and two of the geometric
averages (8 and 9) are higher than the aggregative indexes (1, 2, and 3)
and the census index (10). The geometric average with 1923 weights
(7 on the chart) is lowest. If the weights used applied strictly to the
base period, an arithmetic average of relatives with base-year weights
should agree with an aggregative index similarly weighted. The extent
of disagreement between the arithmetic average with 1923 weights (4
on the chart) and the aggregative with 1923 weights (1) measures the
degree of upward bias in the arithmetic average. The average of the
two geometric indexes (9 on the chart) might be expected to agree more
closely with the average of the two aggregative indexes (3), were the
weight period and base period exactly coincident. These differences
illustrate the advantage, if not the necessity, of employing the
aggregative form where weights strictly applicable to the base period are not
available.

Summary. The new index of industrial production of the Federal
Reserve Board is broader in scope than the indexes previously compiled
by the Board, and the methods of its construction have been improved
in many respects on the basis of experience. It is distinctive in that it
is based upon statistics of daily average output, is weighted so as to
conform to industrial conditions in both current and earlier years, and
is computed by means of an aggregative formula.
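
The agreement noted above between an arithmetic average of relatives with base-year value weights and an aggregative index similarly weighted can be checked numerically. The sketch below uses invented figures and hypothetical function names; it also shows that value weights taken from another period break the agreement, and that a geometric average with the same weights falls below the arithmetic one.

```python
from math import exp, log

def aggregative(p, q_given, q_base):
    """Aggregative index as a ratio: sum(p*q_given) / sum(p*q_base)."""
    return (sum(pi * a for pi, a in zip(p, q_given))
            / sum(pi * b for pi, b in zip(p, q_base)))

def arithmetic_of_relatives(w, q_given, q_base):
    """Weighted arithmetic average of the relatives q_given/q_base."""
    return sum(wi * a / b for wi, a, b in zip(w, q_given, q_base)) / sum(w)

def geometric_of_relatives(w, q_given, q_base):
    """Weighted geometric average of the relatives q_given/q_base."""
    total = sum(wi * log(a / b) for wi, a, b in zip(w, q_given, q_base))
    return exp(total / sum(w))

# Hypothetical weight factors and quantities (not from the article).
p0 = [2.0, 4.0, 10.0]
q0 = [100.0, 50.0, 20.0]   # base period
q1 = [80.0, 45.0, 10.0]    # given period

base_values = [p * q for p, q in zip(p0, q0)]    # weights strictly of the base period
other_values = [p * q for p, q in zip(p0, q1)]   # weights from a different period

agg = aggregative(p0, q1, q0)
arith_base = arithmetic_of_relatives(base_values, q1, q0)
arith_other = arithmetic_of_relatives(other_values, q1, q0)
geom_base = geometric_of_relatives(base_values, q1, q0)
print(agg, arith_base, arith_other, geom_base)
```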


DATA USED IN INDEX OF MANUFACTURES

Groups and industries | Series | Relative magnitude (per cent)*
[group name illegible] | | 23.0
  [illegible] | Pig-iron production | 2.2
  Steel works and rolling mills and other products | Steel-ingots production | 20.8
[group name illegible] | | 20.5
  [illegible] | Mill consumption of raw cotton | 10.2
  [illegible] | | 6.8
    | Mill consumption of raw wool | 3.5
    | Percentage of loom and spindle hours active | 2.0
    | Percentage of carpet and rug loom hours active | 1.3
  [illegible] | | 3.6
    | Deliveries of raw silk to mills | 2.3
    | Percentage of looms active | 1.2
[group name illegible] | | 10.1
  [illegible] | | 6.2
    | Hogs slaughtered under Federal inspection | 3.7
    | Cattle slaughtered under Federal inspection | 2.1
    | Calves slaughtered under Federal inspection | .2
    | Sheep slaughtered under Federal inspection | .2
  Flour-mills [illegible] | Wheat-flour production | 2.4
  Sugar refining | [illegible] | 1.5
[group name illegible] | | 11.2
  [illegible] | | 8.2
    | Newsprint production | 1.1
    | Book-paper production | 2.0
    | Wrapping-paper production | 1.3
    | Fine-paper production | 1.0
    | Box-board production | 1.8
    | Mechanical-pulp production | .2
    | Chemical-pulp production | .8
  [illegible] | Production of paper-board shipping boxes, in square feet | .6
  Newspaper publishing | Newsprint consumption | 2.4
[group name illegible] | | 9.9
  Lumber and timber products | Lumber production | 9.1
  [illegible] | Oak and maple flooring production | .8
[group name illegible] | | 6.7
  Motor vehicles, including bodies and parts | Production of automobile passenger cars and trucks | 6.0
  [illegible] | Locomotives completed | .4
  Ship and boat [illegible] | [illegible] | .3
[group name illegible] | | 4.0
  Leather, tanned, curried and [illegible] | | 1.6
    | Sole-leather production | .6
    | Upper-leather production | .9
    |   [illegible] | .3
    |   [illegible] | .3
    |   [illegible] | .3
  [illegible] | Production of boots and shoes | 2.4
[group name illegible] | | 3.9
  [illegible] | [illegible] | 1.3
  [illegible] | | 1.3
    | Face-brick production | .9
    | Paving-brick production | .4
  [illegible] | Plate-glass production | 1.3
METALS AND METAL PRODUCTS, OTHER THAN [illegible] | | 4.[illegible]
  Copper smelting and refining | Blister-copper production | [illegible]
  Lead smelting and refining | Crude-lead production | [illegible]
  Zinc smelting and refining | Slab-zinc production | [illegible]
  [illegible] | Deliveries from port warehouses | [illegible]
CHEMICALS AND ALLIED [illegible] | | [illegible]
  Petroleum refining | | [illegible]
    | Gasoline production | [illegible]
    | Kerosene production | [illegible]
    | Fuel-oil production | [illegible]
    | Lubricating-oil production | [illegible]
  [illegible] | | [illegible]
    | By-product-coke production | [illegible]
    | Beehive-coke production | [illegible]
RUBBER PRODUCTS | | [illegible]
  Rubber tires and inner tubes | | [illegible]
    | Pneumatic-tire production | [illegible]
    | Inner-tube production | [illegible]
TOBACCO MANUFACTURES | | [illegible]
  Cigarettes | [illegible] | [illegible]
  Cigars | [illegible] | [illegible]
  Chewing and smoking and snuff | [illegible] | [illegible]

* Derived from figures showing value added by the process of manufacture, given in the Census of
Manufactures of 1923.


DATA USED IN INDEX OF MINERALS

Minerals | Series | Relative magnitude (per cent)†
Bituminous coal | [illegible] | [illegible]
Anthracite coal | [illegible] | [illegible]
Crude petroleum | Deliveries to pipe lines | [illegible]
Iron ore | Shipments of ore through upper Great Lakes ports | [illegible]
[illegible] | Mine production | [illegible]
[illegible] | Crude-lead production | [illegible]
[illegible] | Slab-zinc production | [illegible]
[illegible] | Mine production | [illegible]

† Derived from figures of total value produced in the years 1923-1925, as reported by the Geological
Survey and the Bureau of Mines.


INDEX OF INDUSTRIAL PRODUCTION
[Adjusted for seasonal variations. 1923-1925 average=100]

Months         | 1919 | 1920 | 1921 | 1922 | 1923 | 1924 | 1925 | 1926 | 1927
January        |  83  |  95  |  67  |  74  | 100  |  98  | 105  | 106  | 106
February       |  80  |  95  |  66  |  76  | 100  | 102  | 105  | 107  | 109
March          |  77  |  94  |  65  |  81  | 104  | 101  | 105  | 108  | 111
April          |  78  |  88  |  65  |  77  | 107  |  95  | 103  | 108  | 109
May            |  77  |  90  |  66  |  81  | 107  |  89  | 103  | 107  | 111
June           |  83  |  90  |  65  |  85  | 105  |  85  | 101  | 107  | 108
July           |  87  |  88  |  64  | [illegible] | 103 | 83 | 103 | 107 | 105
August         |  89  |  88  |  66  |  83  | 102  |  89  | 103  | 111  |
September      |  87  |  85  |  67  |  88  | 100  |  94  | 102  | 113  |
October        |  86  |  82  |  71  |  94  |  99  |  95  | 105  | 111  |
November       |  85  |  75  |  71  |  97  |  97  |  97  | 106  | 108  |
December       |  85  |  70  | 7[illegible] | 100 | 96 | 100 | 108 | 105 |
Annual average |  83  |  87  |  67  |  85  | 101  |  95  | 104  | 108  |

INDEX OF PRODUCTION OF MANUFACTURES
[Adjusted for seasonal variations. 1923-1925 average=100]

Months         | 1919 | 1920 | 1921 | 1922 | 1923 | 1924 | 1925 | 1926 | 1927
January        |  84  |  96  |  65  |  73  | 100  |  99  | 105  | 108  | 104
February       |  81  |  97  |  64  |  75  | 100  | 102  | 106  | 109  | 107
March          |  78  |  95  |  63  |  78  | 103  | 101  | 106  | 108  | 110
April          |  79  |  89  |  63  |  81  | 106  |  95  | 103  | 108  | 109
May            |  78  |  91  |  65  |  86  | 107  |  88  | 103  | 107  | 111
June           |  84  |  90  |  64  |  90  | 104  |  83  | 101  | 107  | 108
July           |  88  |  88  |  64  |  89  | 102  |  82  | 103  | 107  | 107
August         |  90  |  88  |  66  |  87  | 101  |  89  | 103  | 112  |
September      |  87  |  85  |  67  |  89  | 101  |  94  | 104  | 113  |
October        |  86  |  80  |  71  |  94  |  98  |  95  | 107  | 111  |
November       |  89  |  72  |  72  |  98  |  96  |  97  | 108  | 106  |
December       |  87  |  67  |  70  | 100  |  95  | 101  | 110  | 103  |
Annual average |  84  |  87  |  67  |  87  | 101  |  94  | 105  | 108  |

INDEX OF PRODUCTION OF MINERALS
[Adjusted for seasonal variations. 1923-1925 average=100]

Months         | 1919 | 1920 | 1921 | 1922 | 1923 | 1924 | 1925 | 1926 | 1927
January        |  78  |  85  |  80  |  76  | 100  | 163 [sic] | 105 | 93 | 117
February       |  69  |  84  |  77  |  87  | 100  | 106  | 101  |  98  | 120
March          |  68  |  89  |  74  |  97  | 106  | 101  |  98  | 108  | 122
April          |  73  |  83  |  73  |  53  | 112  |  92  |  99  | 107  | 106
May            |  75  |  85  | 7[illegible] | 53 | 108 | 93 | 104 | 103 | 108
June           |  78  |  90  |  71  |  58  | 107  |  91  |  99  | 104  | 103
July           |  82  |  89  |  67  |  56  | 109  |  90  | 102  | 105  |  97
August         |  79  |  92  |  69  |  62  | 110  |  92  | 107  | 109  |
September      |  85  |  85  |  68  |  82  |  98  |  97  |  90  | 111  |
October        |  88  |  91  |  73  |  92  | 105  |  97  |  91  | 116  |
November       |  61  |  92  |  67  |  94  | 104  |  96  |  94  | 118  |
December       |  74  |  91  |  67  |  99  |  99  | 100  |  94  | 120  |
Annual average | [values illegible]

A STUDY OF SPURIOUS CORRELATION 


By M. R. Neifeld


Under certain conditions of correlating data, caution must be used 
in accepting the correlation coefficient at its face value. Factors may 
operate to make the net correlation higher or lower in value than the 
obtained or gross correlation. The term “spurious correlation” has
been applied to some special cases where this is true. There seems to 
be no apparent reason why “spurious correlation” should not apply 
to all cases where a correction must be made to obtain the true or net 
correlation. Mention may be made of Spearman’s formula for allow- 
ing for the influence of errors of observation on the correlation coeffi- 
cient; of the increase in the correlation value due to heterogeneity of 
material; and of the reduction of correlation due to mingling of un- 
correlated with correlated pairs. 

It is common practice in handling economic data to avoid such a
spurious result in correlating time series. Recently Musselman1
has pointed out that spurious correlation is present in such material
as psychological data when the scores on a test are correlated with the
total scores on a battery of tests which contains the first test as one
of its constituent elements. As long ago as 1897 Pearson2 warned
against the presence of spurious correlation when dealing with ratios
or indices, and suggested the possible application of his conclusions to
economic phenomena. The warning becomes particularly important
when correlating relatives, either chain relatives or those on a fixed
base.

Following Pearson we may let $x_1, x_2, x_3, x_4$ be the absolute sizes of
any four correlated subjects; $m_1, m_2, m_3, m_4$ their mean values;
$\sigma_1, \sigma_2, \sigma_3, \sigma_4$ their standard deviations;
$r_{12}, r_{23}, r_{34}, r_{14}, r_{24}, r_{13}$ the six coefficients
of correlation; $\epsilon_1, \epsilon_2, \epsilon_3, \epsilon_4$ the
deviations of the four subjects from their means, i.e.
$x_1 = m_1 + \epsilon_1$, etc.; $i_{13}$ and $i_{24}$ the mean values of
the indices $x_1/x_3$ and $x_2/x_4$ respectively; and $N$ the total
number of groups.



1 “Spurious Correlation Applied to Urn Schemata,” this JOURNAL, September, 1923.
2 “On a Form of Spurious Correlation,” Proceedings of the Royal Society, Vol. LX, 1897.

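Neglecting cubes of $\epsilon/m$, it can be shown that the correlation between
the indices $x_1/x_3$ and $x_2/x_4$ is1

$$\rho = \frac{r_{12}\dfrac{\sigma_1\sigma_2}{m_1 m_2} - r_{14}\dfrac{\sigma_1\sigma_4}{m_1 m_4} - r_{23}\dfrac{\sigma_2\sigma_3}{m_2 m_3} + r_{34}\dfrac{\sigma_3\sigma_4}{m_3 m_4}}{\sqrt{\dfrac{\sigma_1^2}{m_1^2} + \dfrac{\sigma_3^2}{m_3^2} - 2 r_{13}\dfrac{\sigma_1\sigma_3}{m_1 m_3}}\;\sqrt{\dfrac{\sigma_2^2}{m_2^2} + \dfrac{\sigma_4^2}{m_4^2} - 2 r_{24}\dfrac{\sigma_2\sigma_4}{m_2 m_4}}} \tag{1}$$

$$\rho = \frac{v_1 v_2 r_{12} - v_1 v_4 r_{14} - v_2 v_3 r_{23} + v_3 v_4 r_{34}}{\sqrt{v_1^2 + v_3^2 - 2 v_1 v_3 r_{13}}\;\sqrt{v_2^2 + v_4^2 - 2 v_2 v_4 r_{24}}} \tag{2}$$

where $v_i = \sigma_i / m_i$ is the coefficient of variation of $x_i$.
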
Now in (2) if the four subjects forming the indices $x_1/x_3$ and $x_2/x_4$ are
uncorrelated, then $r_{12} = r_{14} = r_{23} = r_{34} = 0$, and $\rho = 0$. In other words,
if the absolute values are uncorrelated, the indices formed from those
values are uncorrelated.2 But if two of the absolute values are
identical, say $x_3 = x_4$, then $r_{34} = 1$, $\sigma_3/m_3 = \sigma_4/m_4$, $v_3 = v_4$, and (2) becomes

$$\rho = \frac{v_1 v_2 r_{12} - v_1 v_3 r_{13} - v_2 v_3 r_{23} + v_3^2}{\sqrt{v_1^2 + v_3^2 - 2 v_1 v_3 r_{13}}\;\sqrt{v_2^2 + v_3^2 - 2 v_2 v_3 r_{23}}} \tag{3}$$



This would have as an economic application the case of relatives
with a common base, for the indices are now $x_1/x_3$ and $x_2/x_3$. If again we
assume no correlation between the absolute values $x_1, x_2, x_3$, i.e. we
assume $r_{12} = r_{13} = r_{23} = 0$, then (3) reduces to

$$\rho = \frac{v_3^2}{\sqrt{v_1^2 + v_3^2}\;\sqrt{v_2^2 + v_3^2}} \tag{4}$$

This is the measure of correlation, purely spurious, due merely to the
common base in the two indices, or that which results simply from the
algebraic processes involved. In the special case when $v_1 = v_2 = v_3$,
$\rho = .5$.
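
Equation (4) is easy to evaluate directly. The short Python sketch below is only an illustration: the function name and the coefficients of variation are assumed for the example and do not come from any data in this paper. It confirms that equal coefficients of variation give a spurious correlation of one half, and shows how the value moves when the common base is much more or much less variable than the other two series.

```python
from math import sqrt

def spurious_rho(v1, v2, v3):
    """Equation (4): correlation of x1/x3 and x2/x3 for uncorrelated x1, x2, x3.

    v1, v2, v3 are the coefficients of variation (sigma/m) of the three series.
    """
    return v3 ** 2 / (sqrt(v1 ** 2 + v3 ** 2) * sqrt(v2 ** 2 + v3 ** 2))

# Hypothetical coefficients of variation.
equal_cv = spurious_rho(0.10, 0.10, 0.10)      # all three series equally dispersed
wide_base = spurious_rho(0.05, 0.05, 0.50)     # common base far more variable
narrow_base = spurious_rho(0.50, 0.50, 0.05)   # common base nearly constant
print(equal_cv, wide_base, narrow_base)
```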


1 Elderton, Frequency Curves and Correlation, p. 124.
2 There is another kind of limiting case for $v_1 = v_2 = v_3 = v_4$. For this case $\rho = 0$ when
$r_{12} = r_{14} = r_{23} = r_{34} > 0$ but $< 1$.

Suppose, however, that there is some degree of correlation between
the variables $x_1, x_2, x_3$. We can obtain an idea of how the spurious
correlation affects the total correlation between the indices $x_1/x_3$ and $x_2/x_3$
by assuming with Pearson that $v_1 = v_2 = v_3 = v$, and that the correlation
between $x_1$ and $x_2$ is $r_{12} = r$, while that between $x_1$ and $x_3$, and that
between $x_2$ and $x_3$, equals $r'$ (i.e. $r_{13} = r_{23} = r'$). Substituting in (3)
we have

$$\rho = \frac{r v^2 - r' v^2 - r' v^2 + v^2}{2 v^2 - 2 r' v^2} = .5 + .5\left(\frac{r - r'}{1 - r'}\right) \tag{5}$$






“The formula illustrates in a simplified case how the correlation
in the indices diverges from the spurious value 0.5 as we alter r and r'
from zero, i.e. as we introduce correlation. According as r, the
correlation of the numerators, is greater or lesser than r' (the correlation of
the numerator with the denominator) the actual index correlation
can be greater than or lesser than the spurious value.”1
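
The relation between equations (3) and (5) can be verified numerically. In the sketch below the function names, the common coefficient of variation and the trial values of r and r' are all assumptions chosen for illustration; the code simply evaluates (3) under Pearson's simplifying conditions and compares the result with (5).

```python
from math import sqrt

def rho_common_base(v1, v2, v3, r12, r13, r23):
    """Equation (3): correlation of the indices x1/x3 and x2/x3."""
    num = v1 * v2 * r12 - v1 * v3 * r13 - v2 * v3 * r23 + v3 ** 2
    den = (sqrt(v1 ** 2 + v3 ** 2 - 2 * v1 * v3 * r13)
           * sqrt(v2 ** 2 + v3 ** 2 - 2 * v2 * v3 * r23))
    return num / den

def rho_equal_cv(r, r_prime):
    """Equation (5): the same correlation when v1 = v2 = v3 and r13 = r23 = r'."""
    return 0.5 + 0.5 * (r - r_prime) / (1.0 - r_prime)

v = 0.2  # hypothetical common coefficient of variation
cases = [(0.0, 0.0), (0.6, 0.3), (0.3, 0.6)]  # (r, r') trial values
results = [(rho_common_base(v, v, v, r, rp, rp), rho_equal_cv(r, rp))
           for r, rp in cases]
for (r, rp), (from_3, from_5) in zip(cases, results):
    print(r, rp, round(from_3, 4), round(from_5, 4))
```

With r greater than r' the index correlation rises above one half; with r less than r' it falls below.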

This is a point the importance of which has not been sufficiently
stressed. It will repay further analysis. We start with the indices
$x_1/x_3$ and $x_2/x_4$ and we seek the correlation coefficient
$\rho_{\frac{x_1}{x_3}\frac{x_2}{x_4}}$. There are,
however, three distinct groups of coefficients according as $x_4$ appears
in the ratio containing $x_1$, or $x_2$, or $x_3$. Suppose we confine our
attention to the group in which $x_4$ appears in the ratio with $x_1$. We are now
seeking $\rho_{\frac{x_1}{x_4}\frac{x_2}{x_3}}$. But within this group there are two possible cases
according as $x_4$ appears in the numerator or in the denominator of the
ratio containing $x_1$; and with each of these cases there can be associated
two cases of the ratio containing $x_2$ and $x_3$ depending again on which
of these latter two is in the numerator or denominator. Thus there
are four possible cases within each of the three major groups.

It happens that the correlations of the four cases within any one
major group are related as can readily be seen by substituting in
equation (2). The four correlations all have the same magnitude but
two are positive and two are negative. The signs can be determined
from the number of interchanges in position in the ratios. An even
number of interchanges gives a plus sign and an odd number of changes
gives a minus sign. Thus
$\rho_{\frac{x_1}{x_4}\frac{x_2}{x_3}} = -\rho_{\frac{x_4}{x_1}\frac{x_2}{x_3}}$,
there being one interchange in position.


1 Pearson, op. cit. p. 489.

Now we can take one of the major groups and confine our attention
to one pair of the ratios in it. Whatever will be true of this pair of
ratios, will be true of the three remaining pairs with proper allowance
for signs. Let us again choose the major group represented by
$\rho_{\frac{x_1}{x_4}\frac{x_2}{x_3}}$ and let us introduce a common factor by making the absolute
values of $x_4$ identical with the absolute values of (a) $x_1$, (b) $x_2$, (c) $x_3$,
successively.

(a) Gives an impossible case in that we no longer have pairs of 
ratios to correlate. (Every value for one series of ratios equals unity.) 
This case can be dismissed from further consideration. 

(b) Gives the case of ratios which have a common factor appearing 
in the denominator of one ratio and in the numerator of the other. 

(c) Gives the case of ratios which have the common factor appear- 
ing in the denominators (or numerators) of both ratios. 

In other words, (b) and (c) would yield results like equation (3), 
with due allowance made for the subscripts. 

Now, then, if we assume no correlation between the absolute values, 
we again get the spurious result denoted by equation (4). (Here 
again it is understood that the subscripts are properly allowed for.) 
It follows at once that the common factor will introduce spurious 
correlation not only when it appears in the denominators of both 
indices, but also when it appears in the numerators of both indices, 
as well as when it appears in the numerator of either index and in the 
denominator of the other. Of these four possibilities, two have special 
application to index numbers. When the common factor appears in 
the denominators of the ratios, we have the case of relatives on a fixed 
base. When the common factor appears in the numerator of the 
first index and in the denominator of the second index, we have the 
case of chain relatives. Furthermore, we can see at once from (2) 
that when the common factor appears in either the numerators or the 
denominators of both ratios, the sign of p will be positive, and the 
correlation of the indices will be too high. When the common factor 
appears in the numerator of either index and in the denominator of 
the other, the sign of p will be negative, and the correlation of the 
indices will be too low. In particular, correlation between relatives 
on a fixed base will be too high, while correlation between chain rela- 
tives will be too low. 
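
A small simulation illustrates the two signs. The sketch below is only a demonstration under stated assumptions: three independent normal series with mean 100 and standard deviation 10 stand in for the absolute values at three dates, and the variable names are invented for the example. Relatives on a fixed base share a denominator, while chain relatives share a factor that is numerator in one and denominator in the other.

```python
import random
from math import sqrt

def pearson(a, b):
    """Product-moment correlation coefficient of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(va * vb)

random.seed(1927)
n = 20000
# Hypothetical, mutually independent absolute values at three dates.
p0 = [random.gauss(100.0, 10.0) for _ in range(n)]
p1 = [random.gauss(100.0, 10.0) for _ in range(n)]
p2 = [random.gauss(100.0, 10.0) for _ in range(n)]

r_absolute = pearson(p1, p2)

# Relatives on a fixed base: p1/p0 and p2/p0 (common denominator).
r_fixed_base = pearson([b / a for a, b in zip(p0, p1)],
                       [c / a for a, c in zip(p0, p2)])

# Chain relatives: p1/p0 and p2/p1 (p1 is numerator of one, denominator of the other).
r_chain = pearson([b / a for a, b in zip(p0, p1)],
                  [c / b for b, c in zip(p1, p2)])

print(round(r_absolute, 3), round(r_fixed_base, 3), round(r_chain, 3))
```

Although the absolute values are uncorrelated, the fixed-base relatives should show a correlation near +.5 and the chain relatives one near -.5, in line with equation (4) and the rule of signs given above.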

There remains for consideration the most practical way of correcting
for spurious correlation between indices. We have just seen that the
coefficient of correlation computed for relatives or ratios will be too
high or too low depending on the way the common factor enters into
the indices. The amount of spurious correlation and the sign can be
determined from (4), but it is inconvenient and unnecessarily laborious
to use that relation. The gross correlation is obtained directly from
the indices, whereas (4) gives the correction in terms of the coefficients
of variation of the absolute measures, and the absolute measures are
not often readily available when working with relatives.

I have been able to work out simple formulae for the corrected 
coefficients of correlation, which are independent of the absolute 
values and which involve only the coefficients of variation of the 
indices themselves. They are not given here because they assume 
zero correlation between the absolute values besides substantial 
equality of their coefficients of variation, and I have not been able to 
find any practical examples where such limiting conditions apply. 
Obviously these assumptions will not hold for such ratios as price 
relatives which play so important a rôle in economic research. There
is substantial correlation between prices from year to year, and 
the spread of prices about the mean may vary greatly. The only 
satisfactory way of securing the net correlation in such cases is to elim- 
inate the effect of the common factor by the method of partial corre- 
lation. 

The formula for “partialling out” the effect of a common factor in
the correlation between a pair of ratios is the usual one for going from
zero order to first order coefficients. But as the application of the
method of partial correlation to indices is somewhat infrequent, a
sample is here given for reference.

$$\rho_{\frac{x}{z}\frac{y}{z}\cdot z} = \frac{r_{\frac{x}{z}\frac{y}{z}} - r_{\frac{x}{z} z}\, r_{\frac{y}{z} z}}{\sqrt{1 - r^2_{\frac{x}{z} z}}\;\sqrt{1 - r^2_{\frac{y}{z} z}}}$$
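
The first-order formula is the same whether the variables are absolute measures or ratios. As a hedged illustration, the Python sketch below (function name assumed) applies it to the three simple correlations between the absolute series reported in Table I of this paper, namely .935 for income and population, .941 for income and automobile registration, and .821 for population and automobile registration.

```python
from math import sqrt

def partial_corr(r_ab, r_ac, r_bc):
    """First-order partial correlation of a and b with c held constant."""
    return (r_ab - r_ac * r_bc) / (sqrt(1.0 - r_ac ** 2) * sqrt(1.0 - r_bc ** 2))

# Simple correlations of the absolute values, as printed in Table I.
r_xy, r_xz, r_yz = 0.935, 0.941, 0.821   # x income, y population, z auto registration

r_xy_z = partial_corr(r_xy, r_xz, r_yz)  # income and population, autos constant
r_yz_x = partial_corr(r_yz, r_xy, r_xz)  # population and autos, income constant
print(round(r_xy_z, 3), round(r_yz_x, 3))
```

The two results come out close to the partial coefficients .840 and -.491 shown in column (4) of Table I; small differences in the third decimal are to be expected because the inputs are already rounded.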


It may, perhaps, be of interest to work out some ratio correlations
for a concrete case. Three series were used:1 population by states,
income by states,2 and automobile registration by states.3 The
correlations of the absolute measures were computed and then all
the possible pairs of ratios were correlated. The results appear in the
appended tables. The ratios are classified into the three major
groups mentioned earlier, and the symbols used are x for income,
y for population, and z for auto registration.

1 I am indebted to Professor F. C. Mills for helpful criticism in the preparation of this paper, and, in
particular, for suggesting this illustration.

2 Distribution of Income by States in 1919, by O. W. Knauth, National Bureau of Economic Research.

3 Facts and Figures of the Automobile Industry, 1919, National Automobile Chamber of Commerce.

4 The five predominantly industrial states, Illinois, Massachusetts, New Jersey, New York, and
Pennsylvania, were omitted from the calculations. All figures were reduced by dividing through by 1,000,000.

Column (1) of Table I indicates which series or ratios are correlated
and column (2) states these in symbolic form. The direct or simple
correlation coefficients appear in column (3), while the partial
correlation coefficients are shown in column (4). In the first three rows of
the table for the absolute values, the factor held constant is the one
not appearing in the row. For the rest of the table where the
coefficients for the ratios are set forth, the factor held constant is the one
that is common to the two members of the correlated pair. For
convenient reference the simple correlations for the absolute values are
repeated in column (5). The values of Table II are required for
calculating the coefficients appearing in column (4) of Table I.

Study of Table I reveals high correlation between the absolute 
quantities and an almost negligible correction (varying from —.001 
to +.026) for the partial values. From the theoretical considerations 
outlined above, it becomes evident that the amount of spurious corre- 
lation present in correlating ratios depends on the magnitude of the 
dispersion of the absolute measures and on the amount of correlation 
between the absolute measures. With sensible equality of dispersion 
and feeble correlation, the spurious value will approach the value 
.5. On the other hand, still assuming feeble correlation between the 
absolute values, the spurious element may approach the value 1 when 
the dispersion of two of the absolute series is small as compared with 
the dispersion of the third series. (This third series would be the one 
appearing in the numerator of equation (4).) Similarly, if the dis- 
persion appearing in the numerator of equation (4) is small as compared 
with the dispersion of the other two series, the value of p may approach 
zero. In other words, the measure of spurious correlation, varying 
between zero and unity, may have a larger value than that indicated 
by Pearson. At any rate, it is evident that the spurious element will 
be the resultant of the dispersions and the correlations of the absolute 
measures. In many economic applications the spurious element 
may be practically negligible as it is in our illustration, while in organic 
correlations it may be appreciable. The final effect of the interplay 
of the dispersions and the correlations of the absolute quantities can 
be determined only by using the method of partial correlation. 

This paper has stressed the importance of allowing for the fictitious 
value introduced into the final result by the algebraic processes in- 
volved in correlating ratios. It must not be overlooked, however, 
that we may at times legitimately consider the ratios as the variables 
in which we are primarily interested and thus take the index
correlation at its face value. In the words of Yule:1 “If the causes, the
nature of which we wish to elucidate, influence directly the ratios or
the indices $x_1/x_3$ and $x_2/x_3$, or the mode in which these ratios are combined,
the correlation between the absolute values of the variables $x_1$ and $x_2$
will be misleading; the correlation should be worked out between

1 “On the Interpretation of Correlation Between Indices,” Journal of the Royal Statistical Society,
Vol. LXXVII, 1910, p. 644.

TABLE I
TABLE OF CORRELATIONS

Series (1) | Symbols (2) | Direct correlation (3) | Partial correlation (4) | Simple correlation (5)
ABSOLUTE VALUES
1. Inc. and Pop. | x and y | .935 | .840 | .935
2. Inc. and Auto. | x and z | .941 | .858 | .941
3. Pop. and Auto. | y and z | .821 | -.491 | .821
GROUP I
1. Inc./Auto. and Auto./Pop. | x/z and z/y | .857* | -.836 | .935
2. Inc./Auto. and Pop./Auto. | x/z and y/z | .804 | .679 | .935
3. Auto./Inc. and Auto./Pop. | z/x and z/y | .796 | .770 | .935
4. Auto./Inc. and Pop./Auto. | z/x and y/z | .784 | [illegible] | .935
GROUP II
1. Inc./Pop. and Auto./Pop. | x/y and z/y | .733 | .727 | .941
2. Inc./Pop. and Pop./Auto. | x/y and y/z | .796 | -.702 | .941
3. Pop./Inc. and Auto./Pop. | y/x and z/y | .785 | -.772 | .941
4. Pop./Inc. and Pop./Auto. | y/x and y/z | .844 | .841 | .941
GROUP III
1. Inc./Pop. and Auto./Inc. | x/y and z/x | .400 | .400 | .821
2. Inc./Pop. and Inc./Auto. | x/y and x/z | .473 | .467 | .821
3. Pop./Inc. and Auto./Inc. | y/x and z/x | .456 | -.457 | .821
4. Pop./Inc. and Inc./Auto. | y/x and x/z | .603 | .598 | .821

* The four simple correlations within each major group would all be identical in absolute size, if the
values of $\epsilon/m$ were small,

METHODS OF COMPUTING SEASONAL INDEXES: 
CONSTANT AND PROGRESSIVE 


By F. L. Carmichael,1 Associate Director, Bureau of Statistical Research,
University of Denver


By the usual method of eliminating trend and seasonal factors from 
time series, the trend figure for a given date is considered as the base. 
The original data are expressed in terms of the corresponding ordinates 
of trend; and from the percentages thus obtained the seasonal is sub- 
tracted. The seasonal index is tacitly assumed, therefore, as the 
indicated percentages of the trend. 

Of the methods generally employed in the computation of seasonal 
indexes, the ratio-trend method2 is the only one that makes direct use
of this relationship. An appropriate average of the ratios of the 
original data to trend is taken for each month, and the resulting crude 
indexes are adjusted by means of a ratio change to an average of 100 
per cent. 
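
The ratio-trend computation just described can be sketched as follows. This is a minimal illustration, not the procedure of any particular author: the median is assumed as the “appropriate average,” the function name is hypothetical, and the data are constructed artificially from a rising trend and a known seasonal pattern so that the result can be checked.

```python
from statistics import median

def ratio_to_trend_index(actual, trend):
    """Seasonal index (per cent) from monthly data covering whole years from January.

    Ratios of actual to trend are averaged (median, by assumption) for each
    calendar month, then scaled by a ratio change so the twelve average 100.
    """
    ratios = [a / t for a, t in zip(actual, trend)]
    crude = [median(ratios[m::12]) for m in range(12)]
    scale = 1200.0 / sum(crude)
    return [c * scale for c in crude]

# Artificial example: three years of a rising trend times a fixed pattern.
pattern = [130, 115, 120, 100, 80, 70, 90, 85, 100, 95, 105, 110]
trend = [200.0 + 2.0 * i for i in range(36)]
actual = [t * pattern[i % 12] / 100.0 for i, t in enumerate(trend)]

index = ratio_to_trend_index(actual, trend)
print([round(v, 1) for v in index])
```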

The purposes of this article are: (1) To outline a method of employing 
the first and second differences of the ratios of the actual to the cor- 
responding trend values in the computation of seasonal indexes and 
to make comparisons, under test conditions, of results obtained; (2) 
to indicate a device by which link and chain relatives may be used 
when the original data contain both positive and negative items; (3) 
to suggest a modification of the ratio-trend method; (4) to indicate an
application of the method of differences, with comparisons, to the 
problem of progressive variation in seasonality. 


I. ON METHODS OF COMPUTING SEASONAL INDEXES ASSUMED CONSTANT 


Method of First Differences. The first difference of the ratios to 
trend for a given month may be defined as the ratio for the given month 
minus that for the preceding. The method of computation may be 
outlined as follows: 

(a) The original data are divided by the corresponding ordinates 
of trend. 

(b) First differences of these ratios are obtained in accordance with 
the above definition. 


(c) A multiple frequency table of these differences is prepared and
an appropriate average1 for each month determined.

(d) The seasonal index is computed from these monthly averages
by a reverse first-difference process outlined in Table I.

1 The writer is greatly indebted to the following: Professor John H. Cover, for valuable criticisms and
suggestions; the Bureau's Fellows in Economic Research, Graham Evans, Albert Butler, E. H. Welch,
Melvin Anderson, H. S. Davis, Herbert Hoogstrate, Reuben Horton, and W. A. Peck, for assistance in
the computation.

2 Helen D. Falkner, “The Measurement of Seasonal Variation,” this JOURNAL, Vol. XIX (1924), pp.
167-179.


TABLE I
COMPUTATION OF THE SEASONAL INDEX—METHOD OF FIRST DIFFERENCES

Month (1)  | Average first differences (2) | First differences adjusted (3) | First differences chained (4) | Seasonal index (per cent) (5)
January    | +0.21 | +0.20 | 1.00 | 130
February   | - .14 | - .15 |  .85 | 115
March      | + .06 | + .05 |  .90 | 120
April      | - .19 | - .20 |  .70 | 100
May        | - .19 | - .20 |  .50 |  80
June       | - .09 | - .10 |  .40 |  70
July       | + .21 | + .20 |  .60 |  90
August     | - .04 | - .05 |  .55 |  85
September  | + .16 | + .15 |  .70 | 100
October    | - .04 | - .05 |  .65 |  95
November   | + .11 | + .10 |  .75 | 105
December   | + .06 | + .05 |  .80 | 110
Totals     | +0.12 |  0.00 | 8.40 | 1200
Averages   | +0.01 |       | 0.70 | 100


Generally, the algebraic sum of the average first differences, column 
(2) above, will not be zero. An adjustment is made so that this is the 
case by increasing or decreasing each figure by a constant amount, to 
obtain column (3). A convenient month is now chosen as a base, or 
starting point, and the adjusted first differences chained to it. In 
column (4) January was so chosen. Since the average of these figures 
is not 100 per cent, they are now increased or decreased as may be 
necessary by the difference between this average and 100 per cent. 
Column (5) so obtained is the seasonal index. 
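
The reverse first-difference process can be followed step by step in code. The sketch below is an illustration with assumed helper names; it takes the average first differences of column (2) of Table I as input and carries out the adjustment, the chaining from January, and the final shift to an average of 100 per cent.

```python
def center(values):
    """Add a constant to every item so that the items sum to zero."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def chain(differences, start):
    """Chain differences to a starting value assigned to the first month."""
    out = [start]
    for d in differences[1:]:
        out.append(out[-1] + d)
    return out

# Column (2) of Table I: average first differences, January to December.
avg_first_diff = [0.21, -0.14, 0.06, -0.19, -0.19, -0.09,
                  0.21, -0.04, 0.16, -0.04, 0.11, 0.06]

adjusted = center(avg_first_diff)        # column (3)
chained = chain(adjusted, 1.00)          # column (4), January taken as base
shift = 100.0 - 100.0 * sum(chained) / len(chained)
seasonal_index = [round(100.0 * c + shift) for c in chained]   # column (5)
print(seasonal_index)
```

Because the adjusted differences sum to zero, the chain closes on itself: the January difference equals the January figure minus the December figure, so the choice of starting month does not affect the final index.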

Method of Second Differences. The second difference of the ratios 
to trend for a given month may be defined as the first difference of 
these ratios for the given month minus that for the preceding. The 
general aspects of the method of computation are identical with those 
of the first-difference method, the additional steps required being 
indicated by the definition of the second difference. Monthly average 
second differences are computed by means of a multiple frequency 
table. The seasonal index is computed from these averages by twice 
reversing the first-difference process. The details of the computation 
are indicated in Table II. 


1 The median or modified median. The term “modified median” is used in this article to indicate the 
arithmetic average of the three or four mid-items, depending upon whether the number of items aver- 
aged is odd or even. 

TABLE II 

COMPUTATION OF THE SEASONAL INDEX—METHOD OF SECOND DIFFERENCES 

              Average       Second        Second        First         First       Seasonal
Month         second      differences   differences   differences   differences     index
            differences    adjusted       chained      adjusted       chained    (per cent)
 (1)            (2)          (3)           (4)           (5)           (6)           (7)
January        +0.16        +0.15          0.00         +0.20          1.00          130
February       - .34        - .35         - .35         - .15           .85          115
March          + .21        + .20         - .15         + .05           .90          120
April          - .24        - .25         - .40         - .20           .70          100
May            + .01          .00         - .40         - .20           .50           80
June           + .11        + .10         - .30         - .10           .40           70
July           + .31        + .30           .00         + .20           .60           90
August         - .24        - .25         - .25         - .05           .55           85
September      + .21        + .20         - .05         + .15           .70          100
October        - .19        - .20         - .25         - .05           .65           95
November       + .16        + .15         - .10         + .10           .75          105
December       - .04        - .05         - .15         + .05           .80          110
Total          +0.12         0.00         -2.40          0.00          8.40         1200
Averages       +0.01                      -0.20                        0.70          100
The average second differences, column (2), require adjustment so 
that the average of the twelve monthly figures shall be zero, as indi- 
cated in column (3). To transform to the first-difference basis, the 
adjusted second differences are chained to a convenient month as 
a base, January being so chosen in column (4). The steps in the com- 
putation from this point are identical with those of the first-difference 
method, second differences chained being in effect first differences, so 
that column (4) of Table II is analogous to column (2) of Table I. 
The figures of column (5), obtained by adjusting column (4) so that 
the average is zero, are chained to a convenient month as base in col- 
umn (6). This column, increased or decreased as may be necessary 
to bring the average to 100 per cent, yields the seasonal index, 
column (7). 
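
As a check on the steps above, the following Python sketch carries column (2) of Table II through to column (7). The helper names are hypothetical, and January is assumed as the base month in both chaining steps, as in the table.

```python
# Average second differences of the ratios to trend, column (2) of Table II.
avg_second_diff = [0.16, -0.34, 0.21, -0.24, 0.01, 0.11,
                   0.31, -0.24, 0.21, -0.19, 0.16, -0.04]


def adjust_to_mean(values, target):
    """Add a constant to every figure so that the average equals `target`."""
    shift = target - sum(values) / len(values)
    return [v + shift for v in values]


def chain(diffs, base):
    """Chain differences to January as base; the January difference is not used."""
    out = [base]
    for d in diffs[1:]:
        out.append(out[-1] + d)
    return out


def seasonal_from_second_differences(second_diffs):
    col3 = adjust_to_mean(second_diffs, 0.0)   # second differences adjusted
    col4 = chain(col3, 0.0)                    # second differences chained
    col5 = adjust_to_mean(col4, 0.0)           # first differences adjusted
    col6 = chain(col5, 1.0)                    # first differences chained
    col7 = adjust_to_mean(col6, 1.0)           # seasonal index, average 100 per cent
    return [100 * v for v in col7]


index = seasonal_from_second_differences(avg_second_diff)
print([round(v) for v in index])
# [130, 115, 120, 100, 80, 70, 90, 85, 100, 95, 105, 110]
```

The last three steps are exactly the first-difference computation, which is why the two tables end in the same index.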

Choice of Method: General Considerations. Because of the influence 
of the cycle upon the ratios to trend, accurate determination of the 
seasonal index by the ratio-trend method is possible only when data 
covering a relatively long period are available. Denoting the ratios 
by $r_1, r_2, r_3, \ldots$, the first differences take the form $r_i - r_{i-1}$; the second 
differences, $r_i - 2r_{i-1} + r_{i-2}$. In case there is but little random fluctuation 
in the series, it is evident that the grouping in the multiple frequency 
table of first differences is better than that of the ratios themselves. 
Thus, the divergence in any month cannot be greater than 
twice the maximum change in cyclical position from one month to the 
next, positive and negative cyclical tendencies producing opposite 
effects upon the first differences. In the case of the second differences 
the effect of smooth cyclical swing is negligible; the grouping in the 
multiple frequency table is primarily dependent upon the amount of 
random variation present. 
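
The effect of a smooth cycle on the three measures can be illustrated numerically. The sketch below assumes a purely sinusoidal cycle of ninety months (the length used for the hypothetical test series later in this article) and an arbitrary amplitude of one-half of the trend; both figures are illustrative assumptions, and no seasonal or random element is included.

```python
import math

STEP = math.pi / 45   # monthly unit giving a 90-month (7.5-year) cycle
AMPLITUDE = 0.5       # assumed cyclical swing in the ratios to trend (illustrative)

# Cyclical component of the ratios to trend over two full cycles.
cycle = [AMPLITUDE * math.sin(STEP * t) for t in range(180)]
first = [cycle[t] - cycle[t - 1] for t in range(1, len(cycle))]
second = [cycle[t] - 2 * cycle[t - 1] + cycle[t - 2] for t in range(2, len(cycle))]

max_ratio = max(abs(v) for v in cycle)
max_first = max(abs(v) for v in first)
max_second = max(abs(v) for v in second)
print(f"ratios {max_ratio:.4f}  first differences {max_first:.4f}  "
      f"second differences {max_second:.4f}")
```

Under these assumptions the cyclical disturbance falls by more than a factor of ten at each differencing, which is the sense in which the effect of a smooth swing on the second differences is negligible.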

The characteristics of the multiple frequency tables and a consideration 
of the errors in the computed seasonals resulting from given errors 
in the monthly measures afford criteria as to choice of method in a 
given case. Generally speaking, it appears that the maximum dis- 
crepancies between the true and the computed seasonals resulting 
from errors of like amounts in the typical monthly measures by the 
three methods (ratio-trend, first-difference and second-difference) are 
approximately in the ratio 1:1:2.¹ Unless the random fluctuation is 
excessive, the grouping of the first differences is better than that of 
the ratios to trend. In general, therefore, it appears that the former 
method will yield the greater accuracy. In the writer’s opinion, this 
is of considerable importance when the period covered by the data is 
short. In exceptional cases the greater refinement of the second- 
difference method may be warranted. 
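
The two theorems cited in the footnote to this paragraph can be checked by direct enumeration. The sketch below is an illustration, not the writer's derivation: it assumes that errors pass linearly through the adjusting and chaining steps (so that only the errors need be propagated), takes January as the base month, and sets e = 1. All names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def center(values):
    """Additive adjustment: subtract the average from every figure."""
    mean = sum(values, Fraction(0)) / len(values)
    return [v - mean for v in values]


def chain(diffs):
    """Chain differences to January as base (the January difference is unused)."""
    levels = [Fraction(0)]
    for d in diffs[1:]:
        levels.append(levels[-1] + d)
    return levels


# Error in the computed seasonal as a function of the errors in the monthly measures.
METHODS = {
    "ratio-trend (additive)": lambda e: center(e),
    "first-difference": lambda e: center(chain(center(e))),
    "second-difference": lambda e: center(chain(center(chain(center(e))))),
}


def max_error(method, errors):
    return max(abs(v) for v in method(list(errors)))


# Theorem (1): a single monthly measure in error by e = 1.
single = [Fraction(1)] + [Fraction(0)] * 11
single_case = {name: max_error(m, single) for name, m in METHODS.items()}

# Theorem (2): every measure in error by +1 or -1, most unfavorable combination.
worst_case = {
    name: max(max_error(m, [Fraction(s) for s in signs])
              for signs in product((1, -1), repeat=12))
    for name, m in METHODS.items()
}

for name in METHODS:
    print(name, single_case[name], worst_case[name])
```

The enumeration covers all 4,096 sign patterns and returns 11/12, 11/24, and 143/144 for the single error, and 11/6, 3, and 343/72 for the most unfavorable case.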

If the usual assumption as to seasonal is correct, a given link relative 
is large (or small) depending upon whether its base adjusted for sea- 
sonal is below (or above) the trend. Thus, for consecutive seasonal 
index figures of 120 per cent and 132 per cent, the normal link relative 
for the second month is 110 per cent. If constant cyclical positions 
of 80 per cent and 120 per cent of the trend are assumed, the link rela- 
tives have the values 112 per cent and 108.6 per cent, respectively. 
This tendency toward error in opposite directions, when the base is 
above or below normal, may be urged as an objection to the method. 
In this respect the use of differences of the ratios to trend is the more 
logical. 
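
The figures quoted in this paragraph can be reproduced as follows. The sketch assumes, as the quoted values imply, that a constant cyclical position displaces both months by the same number of points of the trend; the function name is hypothetical.

```python
def link_relative(seasonal_prev, seasonal_cur, cyclical_position):
    """Link relative (per cent) of the current month on the preceding month.

    All arguments are in per cent of trend. The cycle is assumed to add the same
    displacement (cyclical_position - 100) to both months.
    """
    displacement = cyclical_position - 100.0
    return 100.0 * (seasonal_cur + displacement) / (seasonal_prev + displacement)


for position in (100, 80, 120):
    print(position, round(link_relative(120, 132, position), 1))
# 100 110.0
# 80 112.0
# 120 108.6
```

At 80 per cent of trend the base is 100 and the link relative is 112 per cent; at 120 per cent the base is 140 and the link relative is 108.6 per cent, the errors lying on opposite sides of the normal 110 per cent.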

Series with Positive and Negative Items: Link- and Chain-Relative 
Method. In a series such as telephone station net gains, both positive 
and negative items may appear. Obviously for such a series the usual 
link- and chain-relative method is not valid. While no difficulty is 
encountered by the ratio-trend and difference methods, it is desired 
to present a device by which link and chain relatives may be used. 
The steps in the computation may be outlined as follows: 

(a) The ratios to trend are increased by a constant amount greater 
than the numerical value of the smallest ratio. Preferably, the small- 
est item of the new series should be considerably above zero. 


1 The following theorems are pertinent: (1) If an error $e$ is made in the computation of one of the 
monthly measures by each of the three methods (ratio-trend, crude indexes adjusted additively; first-difference; 
and second-difference), all others being accurately determined, the maximum errors in the 
computed seasonals are $11e/12$, $11e/24$, and $143e/144$, respectively. (2) If all measures are in error 
in the most unfavorable manner by amounts numerically equal to $e$, the maximum errors in the computed 
seasonals by the three methods in order are $11e/6$, $3e$, and $343e/72$. 

Additive adjustment of the crude indexes of the ratio-trend method is discussed in subsequent paragraphs 
of this article. The errors noted above appear to be comparable in most cases with those obtained 
when the adjustment is made by a ratio change. 

(b) Link relatives of this series are computed and, by means of a 
multiple frequency table, monthly averages determined. 

(c) The average link relatives are corrected for circular error and 
chained in the usual manner, to obtain the seasonal index of the series 
resulting from step (a). 

(d) An adjustment is now made placing this index on the original 
data level as follows: 

If the average of the series resulting from step (a) is $k$ times that of 
the ratios themselves and $s_i'$ is the seasonal found by step (c), the seasonal 
$s_i$ of the original series is obtained by use of the formula 


$s_i = 1 + k(s_i' - 1)$. 


If the seasonal and trend computations are based on identical periods, 
assuming that the latter is made by the method of least squares, the 
average of the ratios to trend will be approximately 100 per cent. In 
case each figure is increased by 200 per cent, under these conditions, 
the value of $k$ is 3. If step (c) yields an index figure of 110 per cent for 
a given month, the true seasonal is 130 per cent. 
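
The adjustment of step (d) is a one-line computation. In the sketch below the function name is hypothetical, and seasonals are expressed as fractions of the average (1.10 for 110 per cent).

```python
def original_seasonal(shifted_seasonal, k):
    """Return s = 1 + k (s' - 1), where s' is the seasonal of the shifted series."""
    return 1.0 + k * (shifted_seasonal - 1.0)


mean_ratio = 1.00   # average of the ratios to trend, about 100 per cent
shift = 2.00        # constant added to every ratio in step (a), 200 per cent
k = (mean_ratio + shift) / mean_ratio   # average of shifted series / average of ratios

print(k)                                         # 3.0
print(round(100 * original_seasonal(1.10, k)))   # 130
```

A month standing 10 per cent above the average of the shifted series stands 0.30 above the average of the original ratios, since the shifted average is three times as large.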

Adjustment of Crude Indexes: Ratio-Trend Method. As indicated 
above, the proposed ratio-trend method requires adjustment of the 
crude indexes by means of a ratio change. The average of the crude 
indexes is computed and each of the twelve figures divided by it. 

The necessity for the adjustment results from errors inherent in the 
method of computing the monthly averages. Under the assumption 
of uniformity in grouping (as between different months) in the multiple 
frequency table, it appears that the margin of error should be the same 
for the large index figures as for the small. If an error of 2 per cent of 
the normal monthly average is possible in the one case, it will be 
possible in the other. If this view is correct, the adjustment should 
be additive rather than a ratio change. Examples are presented in 
subsequent paragraphs of this article which seem to corroborate this 
contention. 
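
The difference between the two adjustments may be seen from a small example. The crude indexes below are hypothetical: each is taken 2 points too high, so that their average is 102 per cent. The function names are likewise illustrative.

```python
def adjust_by_ratio(crude):
    """Divide each crude index by the average of the twelve."""
    mean = sum(crude) / len(crude)
    return [100.0 * c / mean for c in crude]


def adjust_additively(crude):
    """Add the same constant to each crude index so that the average is 100."""
    mean = sum(crude) / len(crude)
    return [c + (100.0 - mean) for c in crude]


# Hypothetical crude indexes (per cent), each 2 points above the true figure.
crude = [132, 112, 122, 112, 92, 102, 82, 77, 87, 102, 92, 112]

by_ratio = adjust_by_ratio(crude)
additive = adjust_additively(crude)
print(round(by_ratio[0], 2), round(by_ratio[7], 2))   # 129.41 75.49
print(round(additive[0], 2), round(additive[7], 2))   # 130.0 75.0
```

When the error is the same number of points in every month, the additive adjustment removes it exactly, while the ratio change takes more from the large figures than from the small.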

Methods Compared under Test Conditions. In order to compare 
results by different methods, an hypothetical series¹ was constructed 
covering a twenty-two and one-half-year period with known trend, 


1 Expressible mathematically by the equation $y = (2000 + 360x/\pi)s + 1000 \sin x$; January, the first 
month of the data, at the origin; monthly unit on the $x$-axis, $\pi/45$, so that the monthly trend increment 
is 8, and the cycle is seven and one-half years in length; seasonal index, $s$, January to December in order: 
130, 110, 120, 110, 90, 100, 80, 75, 85, 100, 90, 110 per cent. 

In the development of the series and the computation of the indexes uniformity of procedure was 
emphasized. Ratios to trend, first and second differences of these ratios and link relatives were carried 
to tenths of per cent. Modified medians were used throughout, averages of the three or four mid-items 
in the multiple frequency tables being determined with as great accuracy as the actual figures would 
permit. This refinement, though introducing fictitious accuracy perhaps, was employed to eliminate 
possible discrepancy from this source. 










seasonal, and cyclical movements. Seasonal computations were made 
by the four methods for the whole and for selected parts of the period. 
The maximum discrepancies between the computed and the assumed 
indexes are summarized in Table III. 
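
For readers who wish to repeat the test, the hypothetical series may be generated as below. The sketch reads the footnoted equation as y = (2000 + 360x/π)s + 1000 sin x with a monthly step of π/45; the function name and the layout of the returned rows are assumptions for illustration only.

```python
import math

# Assumed seasonal index, January to December, per cent (from the footnote).
SEASONAL = [130, 110, 120, 110, 90, 100, 80, 75, 85, 100, 90, 110]


def hypothetical_series(months=270):
    """Return (month number, trend, actual, ratio to trend) for each month.

    270 months is twenty-two and one-half years; month 0 is January at the origin.
    """
    rows = []
    for t in range(months):
        x = t * math.pi / 45                  # position on the x-axis
        trend = 2000 + 360 * x / math.pi      # equals 2000 + 8t
        s = SEASONAL[t % 12] / 100
        y = trend * s + 1000 * math.sin(x)    # seasonal on the trend, cycle added
        rows.append((t, trend, y, y / trend))
    return rows


series = hypothetical_series()
print(len(series), series[0][2])   # 270 months; first January equals 2600.0
```

Shorter periods, such as the fifteen-year case, are obtained by passing a smaller number of months, all periods starting from the same January.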


TABLE III 

MAXIMUM DISCREPANCIES BETWEEN COMPUTED AND ASSUMED SEASONAL 
INDEXES—SERIES WITH CONSTANT SEASONAL 

                                                Length of period in years
Method                                 22½      20½      18½      15       13       11
(1)                                    (2)      (3)      (4)      (5)      (6)      (7)
Second-difference                     0.07%    0.04%    0.07%    0.07%    0.11%    0.04%
First-difference                       .08      .15      .36      .17  [illegible]  .61
Link- and chain-relative                8   [illegible]  2.5       4       2.6      5.1
Ratio-trend—crude indexes adjusted:
  By ratio change                      2.1      3.8      4.2      1.7      3.3      6.2
  Additively                           2.0      2.5      1.9      1.7      2.8      2.4


The “zones of distribution” in the multiple frequency tables accord 
well with the results of the computation. The second differences 
show the best grouping, the maximum zone depth being less than one 
per cent; the seasonals computed by this method agree most closely 
with the assumed. Similarly, frequency tables of first differences and 
of link relatives evidence variations for individual months of 7 per cent 
and 23 per cent, respectively. For the ratios to trend, the zones of 
distribution approximate 80 per cent in depth. With reference to the 
contention that the crude indexes should be corrected additively, it 
may be noted also that the zone depths for the months of large and of 
small index figures are practically identical, indicating under the condi- 
tions imposed that the margin of error in terms of the trend is the same 
in the two cases. 

Since the cycle is seven and one-half years in length, the data for 
columns (2), (3), and (4) cover three cycles, three cycles less two 
years, and three cycles less four years, respectively; for columns (5), 
(6), and (7), two cycles, two cycles less two years, and two cycles less 
four years, respectively. The six periods have a common starting 
point, January of the first year; there are no gaps in the data. It may 
be seen, therefore, that the data for columns (2) and (5), (3) and (6), 
and (4) and (7) end on the same phase of the cycle. Comparison of 
results should be made with this in mind. 

Best results are obtained when the beginning and end of the data 
fall on the same phase of the cycle, columns (2) and (5). Variation in 
the discrepancies by the second-difference method is not significant, 
results being as close in all cases as could well be expected, since the 
ratios to trend were carried to tenths of per cent only. By the first- 
difference method, discrepancies are smaller for the long than for the 
corresponding short periods. The same is true for the most part with 
the link- and chain-relative, but not with the ratio-trend method. This 
opposite tendency is doubtless due to the fact that data (for certain 
months) for two cycles are more nearly balanced with reference to the 
trend (or zero) line than is the case for three cycles. Thus, in fifteen 
years, November falls above the zero line seven times, below eight 
times; in the twenty-two and one-half years, above ten times, below 
twelve times. The effect upon the crude indexes of the ratio-trend 
method is obvious; in view of the tendency toward error in opposite 
directions in the link relative, when the base is above or below normal, 
slight discrepancies may be expected to result when link and chain 
relatives are used. 

Additive correction of the ratio-trend crude indexes yields decidedly 
better results than the ratio change. 

Illustrative of the method outlined for computing the seasonal when 
both positive and negative items are present, a series¹ with known 
trend, seasonal, and cyclical movements covering a period of fifteen 
years was used. The ratios to trend were increased by 200 per cent, 
so that the value of k in the above formula is 3. Indexes were com- 
puted by the modified link- and chain-relative, the first-difference and 
the second-difference methods, yielding maximum discrepancies 
between the assumed and the computed indexes of 0.3, 0.5, and 0.06 
per cent, respectively. 


II. ON THE PROBLEM OF PROGRESSIVE SEASONAL VARIATION 


Methods have been proposed by which ratios of the actual to the 
corresponding trend values² and link relatives³ are employed in the 
measurement of progressive seasonal variation. The average tend- 
ency or trend of the ratios (or link relatives) for each month is deter- 
mined by an appropriate method, for example, the method of least 
squares, and typical⁴ ratios (or link relatives) for given years (or dates) 
computed. By the ratio-trend method, no difficulty is encountered in 


1 Expressible mathematically by the equation $y = (600 + 90x/\pi)s + 1000 \sin (x + \pi)$; January, the first 
month of the data, at $x = \pi/90$; monthly unit on the $x$-axis, $\pi/45$, so that the monthly trend increment 
is 2, and the cycle is seven and one-half years in length; the factor, $s$, indicating multiplication by the 
appropriate index figures, January to December in order: 140, 110, 120, 100, 90, 70, 80, 60, 90, 100, 110, 
130 per cent. 

2 O. Gressens, “On the Measurement of Seasonal Variation,” this JOURNAL, Vol. XX (1925), pp. 
203-210. 

3 W. L. Crum, “Progressive Variation in Seasonality,” this JOURNAL, Vol. XX (1925), pp. 48-64. 

4 In this and succeeding paragraphs an ordinate of the trend of monthly measures is referred to as 
a typical or computed measure. 
the computation of indexes for specified years; by the proposed link- 
and chain-relative method, typical link relatives are computed for 
July 1 of each year and the resulting seasonals assumed to be suffi- 
ciently close approximations to the true seasonals of those years. 

For both the ratios to trend and the link relatives, the average 
monthly tendencies are materially affected by the swing of the cycle. 
Ordinarily, the slopes of the trend lines are not true measures of the 
changes in seasonal. Under the assumption of smooth cyclical move- 
ment this difficulty is, to a degree, eliminated by the use of first differ- 
ences. The effect of cyclical swing upon the second differences is 
negligible. Decision as to which method will yield the best results in 
a given case will have to be made in the light of the amount of random 
variation present. While it is impossible perhaps to formulate a 
definite criterion, it appears that first differences are preferable to 
ratios to trend or link relatives in most cases. Since random fluctua- 
tions are magnified, so to speak, in the second differences and the com- 
putation is more laborious, its use should perhaps be limited to ex- 
ceptionally smooth series. 

The proposed method of centering the link relatives within the year¹ 
appears to be unsound. When progressive seasonal change takes 
place a link relative, computed in the usual manner, cannot be con- 
sidered as centering between the two months, nor is it a normal link 
relative as of either month. To compute a February link relative as 
of the given February, for example, account must be taken of the effect 
of seasonal change upon the January figure during the month that has 
elapsed. The theoretical aspects of the problem may be explained 
best perhaps by reference to Table IV. 

Columns (2) and (3) are hypothetical seasonal indexes for consecu- 
tive years. Straight-line changes in seasonal are assumed for all 
months, so that column (4), the seasonal index for the year one month 
later than that of column (2), is obtained by taking one-twelfth of the 
differences between corresponding index figures and adding the result 
to the respective 1920 figures. 

To obtain column (5), 1920 and 1921 link relatives were computed by 
dividing each figure by the preceding. (Since the December seasonal 
remains constant, the January, 1920, link relative is 130 ÷ 110). Under 
the assumption that these ratios are representative link relatives as of 
their respective months and that straight-line changes take place 
therein, interpolation was made for link relatives as of December 15, 
1920. The usual seasonal computation based upon the results so 
obtained yields the seasonal index recorded in column (5). 


1 Cf. loc. cit., p. 61. 

TABLE IV 
HYPOTHETICAL PROGRESSIVE SEASONAL CHANGE: CENTERING OF LINK RELATIVES 

                Hypothetical      Seasonal      Seasonal indexes by         Crude*     Col. (7)
              seasonal indexes   index* for   link- and chain-relative    index for    adjusted
               for the years      the year    method for Dec. 15, 1920     Dec. 15,   to average
                                Feb., 1920,   As formerly  As explained      1920        100%
Month          1920     1921   to Jan., 1921    proposed      below
 (1)           (2)      (3)        (4)            (5)          (6)           (7)         (8)
January        130%     131%     130.08%        127.18%      129.14%       130.92%     129.39%
February       110      112      110.17         108.55       110.11        111.67      110.14
March          120      123      120.25         119.00       120.50        122.25      120.72
April          110      114      110.33         109.87       111.00        112.67      111.14
May             90       95       90.42          90.86        91.50         92.92       91.39
June           100      106      100.50         101.16       101.38        103.00      101.47
July            80       87       80.58          81.80        81.56         82.92       81.39
August          75       83       75.67          77.13        76.34         77.67       76.14
September       85       94       85.75          87.39        85.73         87.25       85.72
October        100      110      100.83         102.7[?]      99.87        101.67      100.14
November        90       35       85.42          87.46        84.32         85.42       83.89
December       110      110      110.00         106.88       108.55        110.00      108.47
Averages       100      100      100.00         100.00       100.00        101.53      100.00

* Based upon the assumption of straight-line changes in seasonal within the year for all months. 


The normal February link relative as of February 15, 1920, is ob- 
tained by dividing the February, 1920, index figure by that for January 
as of the same date, that is, as shown in column (4). Relatives for all 
months of 1920 and of 1921 were computed in this manner; and 
straight-line interpolation was made to obtain normal link relatives as 
of December 15, 1920. The usual seasonal computation based thereon 
leads to the index figures shown in column (6). 

Column (7) is obtained by straight-line interpolation between (2) 
and (3), all figures being computed as of December 15, 1920. Additive 
adjustment of column (7) to an average of 100 per cent leads to column 
(8). 
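
Columns (4), (7), and (8) of Table IV follow from columns (2) and (3) by simple interpolation, as the sketch below shows. It assumes that each month's index figure is dated at the middle of its own month and changes along a straight line; the names are hypothetical.

```python
# Hypothetical seasonal indexes, January to December (Table IV).
S1920 = [130, 110, 120, 110, 90, 100, 80, 75, 85, 100, 90, 110]   # column (2)
S1921 = [131, 112, 123, 114, 95, 106, 87, 83, 94, 110, 35, 110]   # column (3), as printed


def interpolate(months_elapsed):
    """Index figures moved forward from their own 1920 dates.

    months_elapsed[m] is the number of months between the middle of month m, 1920,
    and the date wanted; one-twelfth of the annual change is added per month.
    """
    return [a + (b - a) * k / 12 for a, b, k in zip(S1920, S1921, months_elapsed)]


col4 = interpolate([1] * 12)                      # the year one month later
col7 = interpolate([11 - m for m in range(12)])   # all months as of Dec. 15, 1920
mean7 = sum(col7) / 12
col8 = [v - (mean7 - 100) for v in col7]          # additive adjustment to 100 per cent

print([round(v, 2) for v in col7])
print(round(mean7, 2))   # 101.53
```

January lies eleven months before December 15 and therefore receives eleven-twelfths of its annual change, while December receives none; the crude figures consequently average 101.53 rather than 100 per cent.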

As the writer understands the proposal in regard to centering the 
link relatives, column (5) compared with (2) and (3) is illustrative of 
the errors inherent in it. A discrepancy of 3.1 per cent is indicated in 
the December seasonal, assumed to be unchanging; index figures for 
some of the other months do not fall between the respective 1920 and 
1921 seasonals. Moreover, the computation of a seasonal as of a given 
date appears not to be logical. Column (7) obtained by straight-line 
interpolation between (2) and (3), as above explained, does not average 
100 per cent. When the adjustment recorded in column (8) is made, a 
discrepancy of 1.5 per cent in the December figure results. The small 
variations between (6) and (8) are doubtless largely due to the fact that 
a straight-line change in the link relatives does not result in a straight- 
line change in the seasonals. It is interesting to note that a method 
analogous to that for column (6), using first differences instead of link 
relatives, yields results identical with those shown in column (8). 

While there is no theoretical difficulty perhaps in adjusting a time 
series so that link relatives may be computed as of given dates, the 
computation required would appear to be laborious. If typical link 
relatives are computed for a given year, no attempt at centering 
being made, and the year’s seasonal is based thereon, the results 
will be only slightly in error in most cases. In the writer’s opinion, 
the method is correct theoretically provided the last month of the year 
undergoes no seasonal change and in any case is preferable practically, 
if not theoretically, to that of centering the link relatives. The prob- 
lem is further discussed in subsequent paragraphs of this article. 

For comparison of results by the four methods (ratio-trend, first- 
difference, second-difference, and link- and chain-relative), marriage 
licenses issued in Denver¹ for the period January, 1900, through October, 
1926, have been chosen.² 

A straight-line trend was fitted to annual data, by the method of 
least squares, for the years 1900 through 1925 and reduced to a monthly 
basis by dividing the coefficients by appropriate constants. Ratios to 
trend, first and second differences of these ratios, and link relatives of 
the original data were computed. The tendencies of these measures 
for each of the twelve months were found graphically to approximate 
straight lines throughout the period. In order to eliminate the per- 
sonal equation, straight-line trends were fitted to the monthly measures 
by the method of least squares and terminal (1900 and 1926) values 
computed therefrom. 

As stated above, no difficulty is encountered in the computation of 
indexes for specified years when ratios to trend are employed. The 
monthly ratios are computed as of the respective months, and the re- 
sulting crude indexes adjusted to an average of 100 per cent. Clearly, 
no correction is required as a consequence of seasonal change since a 
given ratio is dependent upon the actual and corresponding trend 
values only. 

If first differences are used and an index for the year beginning with 
January,³ for example, is desired, the computed January first differences 


1 For original data, see University of Denver Business Review, Vol. III, No. 1 (January, 1927), p. 29. 

2 Other methods were considered, but for various reasons discarded. For example, the method outlined 
by W. I. King, “An Improved Method for Measuring the Seasonal Factor,” this JOURNAL, Vol. 
XIX (1924), pp. 301-313, was discarded since the personal equation is introduced in the computation 
in such a manner as to affect the results materially. 

3 It should be noted that the total of twelve consecutive monthly figures of an index of progressive 
seasonal change may differ from 1200 per cent. Thus, assuming the total of the figures for the twelve 
months, January to December, inclusive, to be 1200 per cent and the January index to be decreasing 4 
per cent each year, it is evident that the twelve figures beginning with the intermediate February will 
total only 1196 per cent. It follows, therefore, that seasonal computations should be made for calendar 
must be corrected. Assuming the December index figure to be decreasing 
2 per cent of the normal monthly average from year to 
year, it is obvious that the January figure determined from the trend 
fitted to the January first differences will be 2 per cent too small. 

To ascertain the amount of the year-to-year change in the December 
seasonal, it is necessary merely to apply the reverse first-difference 
process to the annual changes in the typical monthly first differences. 
By adjusting the last column of the computation to an average of zero, 
instead of 100 per cent as explained in section I, the annual change in 
each month’s index figure is obtained directly, plus if increasing from 
year to year, minus if decreasing. The January first difference of the 
year-to-year change in the December index figure and of zero assumed 
for January is added algebraically to the January computed first differ- 
ence. The reverse first-difference process applied to the first differ- 
ences computed as of their respective months, with the January figure 
corrected as indicated, yields the index as of the given year. 
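
The correction of the January first difference can be followed in a small sketch. It uses the assumed progressive seasonal of the fifteen-year test series described in a later footnote of this article, in which December falls 2 per cent each year, and it works with exact typical first differences rather than with figures read from fitted trends; the function names are hypothetical.

```python
# Assumed progressive seasonal: index for year n is BASE + SLOPE * n (per cent).
BASE = [130, 110, 120, 110, 90, 100, 80, 75, 85, 100, 90, 110]
SLOPE = [-3, 1, -1, 2, 3, -1, 1.5, 4, -1, -2, -1.5, -2]


def seasonal(n):
    return [b + s * n for b, s in zip(BASE, SLOPE)]


def typical_first_differences(n):
    """First differences as of their respective months; January rests on the
    December of the preceding year."""
    cur, prev = seasonal(n), seasonal(n - 1)
    return [cur[0] - prev[11]] + [cur[m] - cur[m - 1] for m in range(1, 12)]


def reverse_first_difference(diffs, target_mean):
    """Adjust to a zero sum, chain to January, and shift to the target average."""
    correction = sum(diffs) / 12
    adjusted = [d - correction for d in diffs]
    levels = [0.0]
    for d in adjusted[1:]:
        levels.append(levels[-1] + d)
    shift = target_mean - sum(levels) / 12
    return [v + shift for v in levels]


year = 5
diffs = typical_first_differences(year)
# Annual changes in the typical monthly first differences.
annual_change = [b - a for a, b in zip(diffs, typical_first_differences(year + 1))]
# Last column adjusted to an average of zero: annual change in each index figure.
change_in_index = reverse_first_difference(annual_change, 0.0)
# January first difference of (December change, zero assumed for January).
corrected = list(diffs)
corrected[0] += 0.0 - change_in_index[11]
index = reverse_first_difference(corrected, 100.0)
uncorrected = reverse_first_difference(diffs, 100.0)

print([round(v, 2) for v in change_in_index])
print([round(v, 2) for v in index])
```

With the correction the assumed index for the year is recovered exactly; without it the January figure in this example is about 0.9 per cent too low and the December figure about as much too high.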

When second differences are employed the computed figures for the 
first two months must be corrected, since the second difference for a 
given month is a function of the ratios to trend for it and the two 
preceding months. The reverse second-difference process applied to 
the annual changes in the monthly measures yields the year-to-year 
changes in the index figures. The effect of progressive seasonal change 
upon the January and February second differences is found by taking 
the second differences of the November and December annual changes 
in seasonal together with zero figures assumed for January and Febru- 
ary. These second differences (for January and February) are added 
algebraically to the corresponding computed measures. The seasonal 
computation, using typical second differences as of the respective 
months (after correction of the January and February measures), 
yields the index for the year. 
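
A corresponding sketch for second differences is given below, under the same assumptions as to data (the assumed progressive seasonal of the fifteen-year test series footnoted later in this article) and with exact typical second differences in place of figures read from fitted trends. The names are hypothetical.

```python
# Assumed progressive seasonal: index for year n is BASE + SLOPE * n (per cent).
BASE = [130, 110, 120, 110, 90, 100, 80, 75, 85, 100, 90, 110]
SLOPE = [-3, 1, -1, 2, 3, -1, 1.5, 4, -1, -2, -1.5, -2]


def seasonal(n):
    return [b + s * n for b, s in zip(BASE, SLOPE)]


def shift_to_mean(values, target):
    shift = target - sum(values) / len(values)
    return [v + shift for v in values]


def chain(diffs):
    levels = [0.0]
    for d in diffs[1:]:
        levels.append(levels[-1] + d)
    return levels


def reverse_second_difference(second_diffs, target_mean):
    """Twice reverse the first-difference process."""
    first = shift_to_mean(chain(shift_to_mean(second_diffs, 0.0)), 0.0)
    return shift_to_mean(chain(first), target_mean)


def typical_second_differences(n):
    """January and February rest on November and December of the preceding year."""
    seq = seasonal(n - 1)[10:] + seasonal(n)
    return [seq[i] - 2 * seq[i - 1] + seq[i - 2] for i in range(2, 14)]


year = 5
sd = typical_second_differences(year)
annual_change = [b - a for a, b in zip(sd, typical_second_differences(year + 1))]
change_in_index = reverse_second_difference(annual_change, 0.0)

# Second differences of (November change, December change, 0, 0).
tail = [change_in_index[10], change_in_index[11], 0.0, 0.0]
corrected = list(sd)
corrected[0] += tail[2] - 2 * tail[1] + tail[0]   # January
corrected[1] += tail[3] - 2 * tail[2] + tail[1]   # February
index = reverse_second_difference(corrected, 100.0)

print([round(v, 2) for v in index])
```

Only the first two months require correction, since no other second difference reaches back into the preceding year.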

The computation of the seasonal for a given year by the link- and 
chain-relative method is not so simple. Two approximate methods are 
suggested: (1) Typical link relatives are computed for the given year as 
of their respective months and for one year later. The annual change 
in the December index figure is indicated by a comparison of the two 
seasonals obtained from the usual computation with these two sets of 
link relatives. If the December seasonal is decreasing from year to 
year, the base for the January link relative is obviously too large, so 





years only, provided 100 per cent is desired as the average of a given (calendar) year’s index figures. 
Interpolation between an index computed for a calendar year at one point in a time series and one for 
the year April through March, for example, at another would lead to erroneous results. In the dis- 
cussion and illustrative examples that follow, the index figures for any calendar year are assumed to 
average 100 per cent. 
that the computed measure for this month must be increased. If 
$s_{12}$ and $c_{12}$ are the December seasonal for the given year (as above 
determined) and its year-to-year change, respectively, the base for the 
January measure is $s_{12} - c_{12}$; it must be changed to $s_{12}$. Hence, the 
adjustment factor to be applied to the computed January link relative 
is $(s_{12} - c_{12})/s_{12}$. The adjusted January link relative and those computed 
from the trend equations for the remaining months will yield, 
through the usual seasonal computation, the desired index. (2) The 
year-to-year changes in the link relatives obtained from the monthly 
trend computations are added algebraically to 100 per cent. The 
resulting figures are treated as link relatives; and the usual seasonal 
computation is made. The index figures so obtained less 100 per cent 
are approximations to the year-to-year percentage changes in the index 
figures for the corresponding months. The December seasonal of the 
given year, taken as 100 per cent, is the true base for the January link 
relative. Actually, the base is the December figure for the preceding 
year, that is, 1 minus the year-to-year percentage change in the December 
seasonal. If $c_{12}$ is this percentage change, the adjustment factor to 
be applied to the January computed link relative is $(1 - c_{12})$. 
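
The adjustment factor of method (1) may be illustrated with hypothetical figures: a December seasonal of 100 per cent falling 2 per cent a year, and a January seasonal of 130 per cent. The function name and the figures are assumptions for illustration only.

```python
def adjusted_january_link(computed_link, s12, c12):
    """Method (1): multiply the computed January link relative by (s12 - c12) / s12.

    s12 is the December seasonal of the given year and c12 its year-to-year
    change (negative when December is decreasing), both in per cent.
    """
    return computed_link * (s12 - c12) / s12


s12, c12, s1 = 100.0, -2.0, 130.0       # hypothetical figures
computed = 100 * s1 / (s12 - c12)       # base is the preceding December, 102
adjusted = adjusted_january_link(computed, s12, c12)

# Method (2): with c12 expressed as a fraction of the December seasonal,
# the factor is (1 - c12).
factor_method_2 = 1 - c12 / s12

print(round(computed, 2), round(adjusted, 2), factor_method_2)   # 127.45 130.0 1.02
```

In this example the two factors coincide; in practice method (2) is the less exact because its percentage changes are themselves approximations.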

Method (1) is thought to be the more exact. From a theoretical 
point of view perhaps the only objection is the fact that the January 
link relatives used in the first two seasonal computations are subject to 
adjustment. The effect of this upon the year-to-year change in the 
December seasonal is doubtless entirely negligible. Practically, the 
objections are great, since three seasonal computations are required to 
determine the true index for a given year. If the monthly trends are 
straight lines, results of the first seasonal computation of method (2) 
may be used for correction of the January link relative for any year; 
after it has been made one seasonal computation for each year is sufficient. 
Method (2)¹ is used in subsequent paragraphs of this article. 

Terminal indexes (by the four methods) of marriage licenses issued 
in Denver are recorded in Table V. 

It is interesting to note that 3 per cent is the maximum variation 
between corresponding index figures by the four methods. Results by 
the difference methods differ by 1 per cent only. Perhaps the most 
striking discrepancy appears in the January figures, an increase by the 
ratio-trend method, a decrease by the difference methods. 

The relative increase in marriages during the summer months is 


1 The year-to-year changes in the seasonal of marriage licenses issued in Denver, obtained by con- 
verting the raw percentages of method (2) to the normal monthly average as a base, agree closely with 
those shown by the respective terminal (1900 and 1926) indexes recorded in a later paragraph. Application 
of the raw percentages to the seasonal for the mid-year of the period yields results differing from 
them by less than 0.07 per cent on an average; the maximum discrepancy for a given month is 0.11 per 
cent. 

TABLE V 
MARRIAGE LICENSES ISSUED IN DENVER—TERMINAL INDEXES OF SEASONAL 
VARIATION 

                 Second           First          Ratios to        Link and
              differences      differences        trend*       chain relatives
Month         1900    1926     1900    1926     1900    1926     1900    1926
January        86%     85%      86%     85%      85%     87%      85%     85%
February       81      73       81      73       80      74       80      73
March          77      70       77      69       76      70       77      69
April          92      84       92      83       92      84       93      83
May            80      95       81      94       80      94       82      93
June          145     154      146     154      146     153      147     156
July           90     104       90     104       90     103       90     105
August         92     125       92     125       92     124       92     127
September     118     114      118     115      119     113      117     114
October       116      96      115      97      116      96      115      96
November      105      94      104      95      105      95      105      93
December      118     106      118     106      119     107      117     106

* Crude indexes adjusted additively; seasonals computed by ratio adjustment practically the same, 
since averages of crude indexes differ but little from 100 per cent. 





worthy of note. May, June, July, and August evidence increases over 
the period from eight to thirty-three per cent of the normal monthly 
average; all other months, with the possible exception of January. show 
decreases. 

Over the twenty-seven-year period, indexes by any of the four 
methods are doubtless accurate enough for all practical purposes. In- 
quiry may well be made as to how the results compare if the series 
covered only six or eight years. It was therefore decided to take the 
data for marriage licenses from January, 1919, to date and compute the 
1926 indexes for comparison with the 1926 seasonals obtained above. 
Using the first-difference results as the standard, maximum discrepan- 
cies by the ratio-trend, first-difference, second-difference, and link- and 
chain-relative methods are 13,¹ 6, 12, and 17 per cent, respectively. 

As a further test of method a series² with known trend, progressive 
seasonal and cyclical movements covering a fifteen-year period was 
employed. Terminal indexes computed by the four methods vary 
from the assumed as indicated in Table VI. 

The results again appear to favor the difference method. Indexes 
computed by the use of second differences are strikingly close to the 
assumed. Results by the ratio-trend method (crude indexes adjusted 
additively) compare favorably with those obtained from the first 
differences. Were the terminal points of the series not on the same 
phase of the cycle, there would doubtless be a greater advantage in 
favor of the first-difference method. The importance of the additive 
correction of the ratio-trend crude indexes is emphasized. 

TABLE VI 

MAXIMUM DISCREPANCIES BETWEEN COMPUTED AND ASSUMED TERMINAL 
INDEXES: SERIES WITH CONTROLLED PROGRESSIVE SEASONAL 

Method                                      Maximum discrepancy 

[three method rows illegible in the source] 
Ratio-trend, crude indexes adjusted: 
  [row illegible in the source] 
  [row label illegible in the source]       .13 

1 Crude indexes adjusted additively; by ratio change, 19 per cent. 

2 Expressible mathematically by the equation y = (2000 + 360x/π)s + 1000 sin x; January, the first 
month of the data, at x = π/90; monthly unit on the x-axis, π/45, so that the monthly trend increment is 
8, and the cycle is seven and one-half years in length; the factor, s, indicating multiplication by appro- 
priate percentages, January to December in order: 130 - 3n, 110 + n, 120 - n, 110 + 2n, 90 + 3n, 100 - n, 
80 + 1.5n, 75 + 4n, 85 - n, 100 - 2n, 90 - 1.5n, 110 - 2n, where n = 0, 1, ..., 14, for the first, second, 
..., fifteenth years, respectively. Monthly data and the seasonal indexes are recorded in tables 
VII and VIII. 
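
The controlled test series described in footnote 2 can be regenerated for checking. The following is a minimal sketch and not the author's own computation. It assumes that the garbled symbols in the printed equation stand for x and π, a reading consistent with the stated monthly trend increment of 8 and the cycle of seven and one-half years. The names `SEASONAL_TERMS`, `seasonal_index`, and `build_series` are illustrative only.

```python
import math

# Seasonal percentages for January..December as (base, yearly change) pairs,
# read from footnote 2: e.g. January is 130 - 3n in year n (n = 0, ..., 14).
SEASONAL_TERMS = [
    (130, -3.0), (110, 1.0), (120, -1.0), (110, 2.0), (90, 3.0), (100, -1.0),
    (80, 1.5), (75, 4.0), (85, -1.0), (100, -2.0), (90, -1.5), (110, -2.0),
]


def seasonal_index(month, n):
    """Assumed seasonal percentage for month (0 = January) in year n."""
    base, slope = SEASONAL_TERMS[month]
    return base + slope * n


def build_series(years=15):
    """Monthly values y = (2000 + 360x/pi) * s + 1000 sin x.

    Assumption: x starts at pi/90 in the first January and advances by
    pi/45 each month, so the trend rises 8 per month and the cycle
    repeats every 90 months.
    """
    values = []
    for k in range(12 * years):
        x = math.pi / 90 + k * math.pi / 45
        trend = 2000 + 360 * x / math.pi
        s = seasonal_index(k % 12, k // 12) / 100
        values.append(trend * s + 1000 * math.sin(x))
    return values


series = build_series()
```

Under this reading the twelve seasonal percentages total 1,200 in every year, since the yearly changes cancel, which is what a set of seasonal indexes averaging 100 per cent requires.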

To eliminate the personal equation from the computations and hence 
make the comparisons more valid, the method of least squares was 
used throughout in the determination of the trends of the monthly 
measures. In case individual judgment is exercised, exclusion of 
random items is doubtless easier by the difference than by the link- and 
chain-relative or ratio-trend methods. 

By the ratio-trend and difference methods, straight-line trends fitted 
to the monthly measures yield straight-line seasonal changes. Know- 
ing the terminal indexes, interpolation for intervening years is imme- 
diate. Compared with the link- and chain-relative method, which 
does not enjoy this property, the practical implications are obvious. 
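
The interpolation just mentioned may be sketched as follows. The figures are the second-difference terminal indexes for 1900 and 1926 from Table V; the function name `interpolate_seasonal` and the choice of 1913 as the intervening year are illustrative assumptions.

```python
def interpolate_seasonal(first_year, first_index, last_year, last_index, year):
    """Straight-line interpolation of each month's index between two terminal years."""
    frac = (year - first_year) / (last_year - first_year)
    return [a + frac * (b - a) for a, b in zip(first_index, last_index)]


# Second-difference terminal indexes, January..December, from Table V.
INDEX_1900 = [86, 81, 77, 92, 80, 145, 90, 92, 118, 116, 105, 118]
INDEX_1926 = [85, 73, 70, 84, 95, 154, 104, 125, 114, 96, 94, 106]

# Seasonal index for the mid-year of the period.
mid = interpolate_seasonal(1900, INDEX_1900, 1926, INDEX_1926, 1913)
```

Because both terminal sets total 1,200, every interpolated set does likewise, and no further adjustment to an average of 100 per cent is needed.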


III. CONCLUSIONS 


Theoretical considerations and numerous tests reported in sections 
I and II apparently warrant the conclusion, provided the seasonal 
index is (as usually assumed) the indicated percentages of the trend, 
that the method of first differences of the ratios to trend yields in 
general a more accurate determination of the seasonal index, either 
constant or progressive, than that of ratios to trend or of link and 
chain relatives. 

By a method based upon absolute increases of the ratios to trend, 
link and chain relatives may be used when positive and negative items 
are present. Telephone station net gains may be cited as an illustra- 
tive series. 

The crude indexes of the ratio-trend method should be adjusted (in 
order to average 100 per cent) by an additive rather than by the ratio 
change formerly suggested. This becomes a matter of considerable 
importance when progressive variation in seasonality is being 
measured. 
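
The two ways of bringing crude indexes to an average of 100 per cent may be contrasted in a short sketch. The crude figures below are hypothetical (they are not taken from the paper), and the function names are illustrative.

```python
def adjust_additive(crude):
    """Shift every crude index by the same amount so that the mean is 100."""
    shift = 100 - sum(crude) / len(crude)
    return [c + shift for c in crude]


def adjust_ratio(crude):
    """Scale every crude index by the same factor so that the mean is 100."""
    scale = 100 / (sum(crude) / len(crude))
    return [c * scale for c in crude]


# Hypothetical crude indexes whose average is 102 per cent.
CRUDE = [88, 83, 79, 94, 82, 147, 92, 94, 120, 118, 107, 120]

additive = adjust_additive(CRUDE)
ratio = adjust_ratio(CRUDE)
```

The additive change leaves the differences between months untouched, whereas the ratio change shrinks or stretches them in proportion to the correction; the two agree closely only when the crude average is already near 100.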

The computation of progressive seasonal indexes for given years is 
simpler by the ratio-trend and difference methods than by the use of 
link and chain relatives. Moreover, the centering of link relatives 
within the year as formerly proposed is not sound. 
THE NEW ORGANIZATION OF THE STATISTICAL 
SERVICES IN ITALY 


By L. Galvani, Istituto Centrale di Statistica del Regno d'Italia¹ 


The spirit of expansion which is manifest at present in all Italian 
political, economic and administrative life has recently been extended 
to the reorganization of statistical services which have been placed on 
a new basis directly under M. Mussolini, head of the Government. 
This proves definitely the position in present-day Italy accorded to 
this branch of public administration; and also the opportuneness of 
the recovery from the neglect into which it had unfortunately fallen 
after the retirement of the late M. Bodio (1900), who had lavished on 
it the greatest care and had created for it a place of great honor. 
The appointment of M. Gini to the Presidency of the new Central 
Institute of Statistics (Istituto Centrale di Statistica del Regno 
d’Italia) is, in itself, a guarantee that it will soon regain the ground 
lost and attain a high place among the most advanced nations. 

For the story of the development of official statistics in Italy and of 
the conditions which have surrounded them, we are fortunately not 
obliged to confine ourselves to the tumultuous happenings of the last 
twenty-five years. On the contrary Italian statistics can boast a 
tradition which is not without glory. The very word “statistics” is 
derived from the Italian “stato,” and the first demographic censuses 
which had a fair degree of reliability were taken at the command of the 
Roman emperors. Further, it must be recognized that, of the states 
of Europe, Italy was one of the first to organize a statistical bureau. 
In fact, the one in Italy was instituted in 1807 under the direction of the 
eminent statistician and philosopher Melchiorre Gioia, shortly after 
that created in France by Lucien Bonaparte (1800), and before those 
of Austria (1828), of Belgium (1831), of England (1832), and of Den- 
mark (1833). However, this Italian bureau was short lived and fol- 
lowed the ebbing fortunes of Napoleon. Thereafter many of the 
small states into which Italy was divided had their census bureaus. 
We find in 1832 at Palermo the Central Direction of Statistics for the 
island of Sicily; in 1836 at Turin the Central Commission of Statistics 
for the Piedmont; in 1849 in Tuscany a Bureau of General Statistics; 
and other bureaus in 1851 at Naples for the Neapolitan provinces, and 
in 1858 at Rome for the States of the Church. 

1 Translated by the Editor from the French. 
It was not until after the establishment in 1861 of the present Kingdom 
of Italy that the Central Bureau of Statistics, later called the 
General Direction of Statistics, was created. This central office, 
assisted by a Superior Council of Statistics, was maintained until July, 
1926. Up to 1872 it was under the direction of Maestri, and from then 
to 1900 of Bodio. It was particularly during the latter period that 
Italian statistics attained world renown for exemplary arrangement and 
for the extensiveness and detail of the work which was developed. 
Since 1874 the Annali di Statistica has been published and since 1878 
the Annuario Statistico Italiano. But to publications concerning the 
fundamental statistics of population changes, causes of death, and 
census enumerations, were added also (well in advance of other nations) 
special inquiries into many branches of the national economy, mono- 
graphic studies of the Italian regions, scientific researches in the 
presentation of statistical phenomena, of tables of mortality and of life 
tables, and especially international comparisons of great interest. 
It is regrettable that since 1900 this more worthy form of activity has 
been neglected and that Italian official statistics have left to individuals 
the task of elaborating results and, as far as they could, of contributing 
to science and the national life. However, this has not been entirely 
unfortunate, for in that period the number of experts in statistics 
increased, and to the names of Melchiorre Gioia, Messedaglia, Ferrara, 
and Bodio many others were added, who for the extent, profundity and 
abundance of their scientific production have achieved an eminent 
position in the science. This is a fortunate circumstance which en- 
courages the belief that our official statistics themselves will make 
rapid progress. 

Let us trace in a few words the principal lines of reform which have 
been effectuated. It was not a question, as one might believe in 
reading the title of the law of July, 1926, of readjustment of the sta- 
tistical services, but it was rather a true reconstruction that was sought. 

1. The former General Direction of Statistics which was a part of 
the Ministry of National Economy was discontinued and in its place 
there was created a Central Institute of Statistics which was placed 
under the direct authority of the head of the Government, and under 
the presidency of M. Gini. 

2. This Institute centralizes only the statistical reports of such 
phenomena as are of the greatest interest for public administration. 
They are the population censuses, the industrial censuses, statistics of 
changes in civil status, and some incidental investigations. The 
responsibility for further reports is left to the ministerial and other 
bureaus which are obliged, however, to follow the coordination plans 
prepared by the Institute itself. Thus there has been brought about 
a compromise between centralization and decentralization of statistical 
reporting, a solution which profits by the advantages of the two sys- 
tems while avoiding the disadvantages of an arrangement based solely 
upon either. The former has the advantage of better coordination; 
the latter that of the greatest elasticity and promptness in execution. 
In the system adopted decentralization generally prevails for such as is 
concerned with the material execution of researches, and centraliza- 
tion for such as is involved in the program, method, control of reports, 
coordination, and the study of results. 

3. The law has not placed limits in the various fields exploited 
heretofore, or to be exploited, in all administrations of the state, 
public or syndicalist; but it gave the Central Institute the privilege of 
taking the initiative, recognizing the right to entrust special researches 
to those administrations. In exchange it permitted the latter, upon 
the authorization of the head of the Government, to make use of the 
resources and personnel of the Institute for the procuring of special 
statistics. The most rigorous unity of supervision and the homo- 
geneity of statistical reports are assured by the stipulation that the 
Central Institute alone shall map out the plan of reports, not only for 
the procuring of statistics which it advises or allots to the public 
administrations or to the State, but for those also that the aforesaid 
must execute on their own account. There was therefore good reason 
for the designation “Central Institute of Statistics.” 

4. The Institute, under M. Mancini as Director General, consists 
of four divisions: (a) Division of general, administrative, cultural and 
incidental affairs (M. Antonucci); (b) Division of demographic, 
sanitary and public welfare statistics (M. De Berardinis); (c) Division 
of censuses and of commercial, industrial or general investigations 
(M. Giusti); (d) Bureau of research (special studies, mathematical 
section, cartography, translations, etc.). A new division is soon to 
be allocated to the Institute, that of agricultural statistics, one of the 
immediate ends of which will be achieved by the creation of agricultural 
and forestry registries throughout Italy. 

5. The bureau of research occupies a prominent place in the In- 
stitute. It is directed by M. Livi, Professor at the University of 
Rome. In truth, if the task of reporting data constitutes the basis of 
statistics, it is only in the second phase, in the elaboration of these 
data that the supreme height is reached in the discovery of the laws 
which control manifestations of mass phenomena, and, less in im- 
portance, in furnishing to the Government the most essential informa- 
tion in the demographic, social, sanitary, economic, financial and other 
fields. The crystallization of the activity of this Bureau will come 
through the publication of a series of monographs of a purely scientific 
character in setting forth the most important reports of the Institute. 
On the other hand, it will publish a statistical annual, a monthly 
bulletin, annual statements of population changes, causes of death, 
the results of censuses, as well as results of all other investigations of a 
general character. 

6. The operation of the Central Institute is supervised by the 
Superior Council of Statistics which, with M. Gini as President, is 
made up of authorities on statistical and economic methods (MM. 
Amoroso, Benini, Coletti, Livi, Savorgnan); of public officials (MM. 
De Michelis, Tosti, Troise); of representatives of the syndicalist 
organizations (MM. Olivetti, Serpieri, Sitta); and of the Director 
General of the Institute (M. Mancini). Finally, the preliminary dis- 
cussion of the fields to be covered by the Institute is carried on by a 
number of appropriately specialized research commissions. 

This in brief is the reconstruction of Italian official statistics which 
the National Government has achieved. The preliminary stages of 
the program have been completed in the first half of 1927 by the issue 
of publications on the results of the census of 1921 and on the causes of 
death, which had been neglected under the former Direction of Sta- 
tistics. There is, further, in preparation an industrial census which 
will take place at the end of the summer; statistical bureaus will be 
established in all cities of 50,000 and over, and in connection with all 
chambers of commerce. Finally, a new series of index numbers has 
been prepared and much other work undertaken the results of which 
cannot be described here. 
RATIO OF WAGES TO VALUE ADDED BY 
MANUFACTURING PROCESSES IN MASSACHUSETTS 
AND THE UNITED STATES 


By M. Estelle Moulton and Raymond E. Pannier 


INTRODUCTION 


Years ago, a popular song declared that “the rich are getting richer, 
and the poor are getting poorer.” The economics of this statement has 
been challenged at various times. It is the purpose of this study to 
make a statistical inquiry into the latter half of the above quotation, in 
order to determine its truth or falsity in regard to industrial wage 
earners. The general plan of the study is as follows: Eight leading in- 
dustries in Massachusetts were studied on the basis of the federal and 
state census reports for the years 1899, 1904, 1909, and 1913 to 1925. 
The amount of wages earned was divided by the total increment due to 
manufacturing processes (the difference between the value of the 
finished product and the value of the raw materials), and the percentage 
of the increment that went to labor was thereby derived. The chart 
accompanying this study was then made from the data thus determined. 
A like study was made for the same eight industries in the United 
States as a whole. The federal census was taken only every five years 
until 1919, however, and every two years thereafter, so the federal 
statistics are not as complete as those for Massachusetts alone. Per- 
centages were then derived and charts made for industry as a whole in 
Massachusetts and in the United States. 
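
The percentage used throughout the study can be stated compactly. The sketch below follows the definition given above (wages divided by the difference between the value of the finished product and the value of the raw materials); the function name `wage_share` and the dollar figures are hypothetical and are not drawn from the census reports.

```python
def wage_share(wages, product_value, materials_cost):
    """Per cent of the increment due to manufacturing that went to wages."""
    increment = product_value - materials_cost  # value added by manufacture
    if increment <= 0:
        raise ValueError("value added must be positive")
    return 100 * wages / increment


# Hypothetical industry, in millions of dollars: product 250, materials 150, wages 40.
share = wage_share(40, 250, 150)
```

A ratio so computed moves with any of its three components, which is why the variable factors discussed in the next paragraph have to be kept in view when the percentages are compared from year to year.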

Before proceeding to discuss the actual data and to form our con- 
clusions, it is necessary to defend the statistical value of the percentage 
used. The ratio between total increment and wages has been attacked 
as meaningless because of the many variable factors involved. It is 
argued that strikes, periods of depression, changes in prices and values 
of both raw materials and finished products, and the replacement of 
hand labor by machinery all affect this ratio. Any collection of ratios 
is meaningless no matter with what data it may deal unless the variable 
factors which are sure to be found in any economic problem are stated, 
and their influences explained. A more intensive study than this one 
would of necessity treat this matter in a more exhaustive way than we 
have been able to do. 
GENERAL COMMENTS ON THE PROPORTION OF WAGES TO TOTAL 
INCREMENT 


As suggested in the above paragraph, we will first discuss the effect 
of the variable factors in a general way. In the first place it is necessary 
to make allowance for the business cycle, particularly for periods of 
depression. During such periods it is natural to suppose that the pro- 
portion of wages to total increment is less than during other stages of 
the business cycle. It is also true that inventory values fall during such 
periods, but industry as a whole is very unwilling to admit these losses. 
However, production is stopped just as soon as the market takes the 
turn that causes these falling inventory values. Examination of the 
data shows that a period of depression is accompanied with a low ratio 
of wages to total increment. 

In the second place, allowance must be made for labor union activity. 
Labor unions are always more powerful in prosperous times than in dull 
times. Consequently, we would expect a higher ratio of earnings to 
total increment in prosperous years than at any other time. 

In the third place, consideration must be given to the replacement of 
hand labor by machine methods. All the industries which we have 
studied have been affected but little by this factor, due either to the 
fact that the increase in the use of machinery has been gradual since 
1900 or to the fact that the industry was thoroughly mechanized at the 
beginning. 

No study covering the years within the period here considered can 
overlook the abnormal conditions caused by the war. The data we 
have compiled show that during the war period labor was receiving a 
smaller and smaller proportion of the total increment. In other words, 
wages did not increase nearly so fast as prices. Undoubtedly this was 
due in part to the willingness of the American Federation of Labor to 
follow the suggestion of President Wilson that the unions refrain from 
strikes during this period. Immediately after the war, however, there 
was a rapid increase in wages noted in a previous paragraph. 


COMMENTS ON THE DATA COMPILED FOR THE EIGHT INDUSTRIES 
STUDIED 


If it is possible to speak of a composite curve embodying the general 
characteristics of the data presented herewith, it would be described as 
follows: After the ratio of wages to total increment had reached, in 
1913, a peak higher than any yet attained, there was a constant and 
steady drop during the war period. This was followed by the regain- 
ing of most of the ground during the period of high business activity 
immediately following the war. During the depression of 1921 there 
was a further reduction in the ratio, although in most cases not so 
marked as the drop during the war. Since 1921, the ratio has increased 
steadily in most cases, although it is not yet up to the level of 1913. 
Of course, each curve presents some exception to this general descrip- 
tion. 

The graphs for all industries in the country and for the separate in- 
dustries throughout the country show ratios consistently below those 
for the state of Massachusetts. This is due to the advanced labor laws 
and strength of the trade unions in this state. Trend cannot be 
estimated, however, from the federal statistics because the census was 
not taken annually. Thus in Massachusetts alone have we been able 
to trace a continuous trend of the ratio of wages to the total increment 
resulting from the manufacturing process. Variations from this 
general trend are to be expected in the individual industries. Let us 
now proceed to an examination of the ratios for each of the eight in- 
dustries under consideration. 

1. Boots and Shoes (including cut stock and findings). Considering 
the entire period studied, the total fluctuation of the ratio for boots and 
shoes is the smallest of any of the industries, there being a range be- 
tween 1904 and 1925 of only 10 per cent. With the exception of the 
year 1918 the curve follows very closely the general trend noted above. 
The proportion in this industry, it is interesting to observe, is usually 
higher than that in any of the other industries studied except cotton 
goods. This situation is undoubtedly due to the strong position of or- 
ganized labor in the boot and shoe industry in Massachusetts. The 
ratios for this industry in the entire country, as derived from the federal 
statistics, show very little variation from the ratios for Massachusetts, 
probably due to the great importance of the Massachusetts boot and 
shoe industry in the totals for the country. 

2. Cotton Goods (including small wares), and 3. Woolen and Worsted 
Goods (including felt hats). These two groups are so similar that they 
are here treated together. They both show a rapid drop during the 
war period, far more pronounced than in any of the other industries. 
Since 1918, both industries have gained steadily, until 1925 when the 
ratio in both industries was greater than during the pre-war period. 
This again may be attributed to the strength of union labor in these 
industries. Also, there is seen in these high ratios one very good reason 
for the precarious condition of the textile industry in Massachusetts. 
In comparison with the federal statistics it is found that the ratio for 
the entire country is much closer to that for the state in the woolen in- 
dustry than in cotton manufacture. The explanation of this is the 
much lower wage rate prevailing in the southern cotton mills than that 
which exists in Massachusetts. The greatest difference in any one 
year between the federal and state ratios in the cotton industry is 6.8 
per cent, whereas the greatest difference in woolen goods is only 2.7 per 
cent. In other words, although the per cent which labor received of 
the total value added varied considerably over the entire period, the 
ratios for Massachusetts and the entire country followed each other 
very closely from one census to the next. The ratios in cotton seemed 
to be coming closer and closer together during the war period, but in 
1923 they diverged widely again. It will be interesting to observe 
what the federal statistics for 1925 will disclose in this respect. 

4. Foundry and Machine Shop Products. The ratio for the state has 
not yet come back to the pre-war standard, despite the gains both before 
and after the depression period. Except for the great drop in 1921, the 
ratio in this industry has been relatively constant, showing even less 
variation than the ratio for boots and shoes. The ratio for the whole 
country in this industry has also been quite constant, and as in the case 
of the other industries was usually below the ratio for Massachusetts. 

5. Electrical Machinery, etc. The curve for this industry varies 
considerably from the general description given above. Since the 
highest point on the curve itself was reached in 1920, there has been a 
steady drop, except for a temporary recovery in 1923, until 1925 when 
the ratio was 15 points lower than it was when the decline began only 
five years earlier. A possible explanation of this fact is the great com- 
petition that has appeared in the electrical equipment line. The 
federal ratios in 1919, 1921, and 1923 show a wide variation from the 
Massachusetts ratios, and do not show the same wide fluctuations in 
the ratio of wages to increment that are found in the state statistics. 

6. Paper and Wood Pulp. Here again we find wide variation from 
the general trend discussed earlier in this study. After sharp drops in 
the ratio during the first two years of the war, gains bringing the ratio 
back almost to the point where it was in 1913 are noted. In 1920, 
however, there was another sharp drop followed by a phenomenal rise 
of almost 18 points in 1921, the depression year. The ratio has since 
dropped again, but in 1925 was still above the pre-war standard. 
Whatever the cause of the sharp rise in 1921, it was evidently character- 
istic of the paper and wood pulp industry all over the country, for the 
federal statistics reveal a ratio much higher in that year than in any of 
the other years studied in this survey. In fact, as nearly as can be de- 
termined from the less frequent federal data, the ratio throughout the 
country was very close to that in Massachusetts. 

7. Leather (tanned, curried and finished). This industry conforms to 
the general trend, although with fluctuations almost as wide as those 
present in the textile industry. The curve also bears a resemblance to 
that for paper and wood pulp in that it began to rise in 1916, but the 
gain was only slight as compared with that for paper and wood pulp, 
and did not approach even the ratio for 1913. There seems to be no 
correlation whatever between the Massachusetts and the federal 
curves for this particular industry. The two ratios were practically 
the same in 1904, but for the next three federal census years the ratio 
for the whole country averaged about 10 points lower than that for 
Massachusetts. Then, curiously enough, in 1921 the federal statistics 
disclose a ratio almost that much higher than the state ratio, only to 
show in 1923 a drop back again below the state ratio. 

8. Slaughtering and Meat Packing. Here we find another very er- 
ratic curve. The ratio showed a big gain for 1915, followed by a drop 
to below the 1913 standard. In 1919 there was an enormous increase 
in the ratio, and since that time it has zigzagged back and forth in an 
apparently meaningless fashion. The fluctuations, however, have 
diminished since 1923 and the ratio seems to be settling down at a 
much higher level than prevailed during the pre-war period. This is 
the only industry the ratio for which showed a marked gain over condi- 
tions fifteen years ago. The federal ratio corresponded very closely 
with that of the state until after the war. The gain in 1919 was not 
nearly so great for the entire country as it was for Massachusetts, there 
being a spread between the federal and state ratios in that year of 
14.7 points. In 1923 the ratio for the country was almost 7 points 
below that for the state. 


CONCLUSION 


We have found that the ratio of wages to total increment fluctuates, 
especially with the business cycle. Consequently it is impossible to 
give a simple positive or negative answer to the question “are the poor 
getting poorer?” But as related to 1913 as a base, it can be seen that 
for all manufacturing industries combined in Massachusetts the 
proportion which labor received of the total increment due to the 
manufacturing processes was somewhat less in 1925 than in 1913. 
In the eight industries individually, however, there are exceptions to 
this general conclusion. 

SOURCES 
Massachusetts Statistics of Manufacturers, 1913-1925. 
United States Census of Manufactures, 1899, 1904, 1909, 1914, 1919, 
1921, and 1923. 

The first problem faced in making this study was that of securing 
comparable data. Massachusetts statistics up to 1909 are not comparable 
with the data after that year because of a general reorganiza- 
tion of classifications at that time and also because in the earlier years 
sampling was the policy generally followed. Therefore the federal 
data for 1899, 1904, and 1909 were used for both Massachusetts and the 
United States. After determining the comparability of the state data 
for different years, it was necessary to determine the comparability of 
state and federal figures. As a result of this determination it was found 
necessary to eliminate the printing and publishing industry, which we 
had planned to include in this study, because there was such wide 
variation between the type of data gathered for this industry by the 
state and that gathered by the federal government. In order to make 
the data for these industries in the state comparable with federal data, 
it was also necessary to make additions to the state data on boots and 
shoes by adding cut stock and findings; to the state figures on cotton 
goods by adding small wares; and to the state figures on woolen and 
worsted goods by adding felt hats. 
KARSTEN’S INTERPRETATION OF THE HARVARD 
BUSINESS INDEXES¹ 


By Alvin H. Hansen, University of Minnesota 


In the December, 1926, issue of this JOURNAL, Mr. Karsten presented 
a new interpretation of the Harvard Business Indexes. The net conclusions 
to be drawn from Mr. Karsten’s paper are, as I see it, as follows: 
(1) that the A Curve is a resultant of the B Curve, and (2) that 
therefore the A Curve cannot properly be utilized as a forecaster of the B 
Curve. 

I wish, in this note, to raise two questions: (1) whether the second 
point referred to above follows necessarily from the first, and (2) 
whether Mr. Karsten has not inadequately stated the logic of the 
relationship between the A Curve and the B Curve. 

First as to the logic of the relationship between the A Curve and the 
B Curve. Mr. Karsten does not, be it noted, challenge the accuracy 
of the Harvard curves from the standpoint of the picture they present 
of business conditions. In other words, the A Curve does in fact 
precede the B Curve. But the A Curve is, he holds, nevertheless, a 
resultant of the B Curve. Now the A Curve is, it seems to me, a 
function of: (1) the prospective profit rate, and (2) the rate of capitali- 
zation. The rate of profit is a function of: (1) the margin between the 
unit selling price and the unit cost of production (account being here 
taken of both prime and supplementary costs), and (2) the volume of 
units sold. We know that the margin between costs and selling prices 
is widest several months before the B Curve declines, and also that the 
rate of increase of the physical volume of trade slows down months be- 
fore the B Curve declines, owing to the fact that when labor and capital 
approach the point of full employment the physical volume of trade 
cannot possibly continue to rise as rapidly as is possible in the first 
phase of prosperity. It follows from the above that the A Curve pre- 
cedes the B Curve. The rate of profit, being the product of the two 
sets of forces enumerated above, is, therefore, a resultant of the busi- 
ness conditions represented by the Harvard B Curve, since the B Curve 
is a composite of the commodity price level and the volume of trade as
represented by bank debits. This aspect of the relation between the A
and the B Curves Mr. Karsten has taken no account of.

1 EDITOR’S NOTE: Attention is called to other papers on this subject which appeared in the April,
1927, issue of the Review of Economic Statistics of the Harvard Economic Service, p. 74, “The Construction
and Interpretation of the Harvard Index of Business Conditions,” by C. J. Bullock, W. M. Persons,
and W. L. Crum; and p. 93, “Money Rates, Bond Yields, and Security Prices,” by Warren M.
Persons.

With respect to the relationship between the two curves, Mr. Kars- 
ten centers his attention exclusively on the rate of capitalization, and 
even here he considers the demand side alone. But the rate of capital- 
ization is affected by the supply of loanable funds as well as by the 
demand for loanable funds. Toward the end of the period of prosperity, 
business men require much capital. Banks and corporations dispose of 
the securities which they have on hand, throwing them upon the general 
investment market, and, at the same time, they offer new issues of 
securities for sale. This is the side of the matter that Mr. Karsten 
sees. But he overlooks the fact that the supply of loanable funds is 
inadequate to equate the increased demand for funds at the old interest 
rate, and therefore the price of capital rises. This pressure for funds 
and the inadequacy of the supply of funds, begins when prosperity has 
reached a point of great intensity, and continues to the end of the crisis. 
The rate of interest, and therefore the rate of capitalization, thus con- 
tinues to rise throughout the period indicated above. Thus the A 
Curve, which is a function of the rate of profit and the rate of capitali- 
zation, continues to fall, on the one side, so long as the profit rate con- 
tinues to fall, and, on the other side, so long as the rate of capitalization 
continues to rise. 

It does not seem to me, therefore, that there is anything novel in the 
notion that the A Curve is a resultant of the B Curve. But the statis- 
tical verification of this relationship is indeed novel and most interest- 
ing. That the cumulative constructed by Mr. Karsten from the Harvard 
B Curve correlates so closely with the actual A Curve is remarkable. 
But it is only a proof that the limited number of series included in the
B and A Curves are representative to a marked degree of “business”
and “speculation” respectively.

But is the A Curve useless as a forecaster if it is granted that it is a
resultant of the B Curve? Not at all. We know, from J. M. Clark
and others, that the demand for fixed capital, being a derived demand
dependent finally upon the demand for consumers’ goods, fluctuates
synchronously not with the increase and decrease in the demand for
consumers’ goods, but with the rates of increase or decrease in the demand
for consumers’ goods. Building contracts precede general business
by several months. It follows that the index of building permits
or building contracts has generally been recognized as an excellent 
forecaster. Now to be sure, if we had accurate measurements of the 
rates of increase and decrease in the demand for finished goods, the 
building contracts index would be useless as a forecaster. But we have
no such data. What we do have is this: Millions of transactions impinge
upon hundreds of thousands of individuals, and these impulses are
quickly transmitted from stage to stage in the exchange process until 
finally they cause the leaders in the constructional industry to act in a 
certain manner, which action shows up statistically in the index of 
building contracts. The actions of the leaders in the constructional 
industry do not antedate those of the leaders in other industries; 
rather they follow therefrom. But it is nevertheless true that one of 
the first statistically known facts that registers all these interrelated 
transactions is the index of building contracts. The building con- 
tracts index registers in part the fluctuations in the rates of increase or 
decrease in the demand for finished goods, and these rates of increase 
and decrease anticipate and forecast absolute fluctuations in demand. 
It is in this sense that the building contracts index is a forecaster. 

Mr. Karsten criticizes the A Curve as a forecaster because it is based 
on present and past data. He wonders “by what occult powers the
speculating world in mass could accomplish a thing which no speculator
seems able to do individually, and that is correctly to divine and discount
the future of business, for between six months and a year in advance.”
But of course forecasting is not intuitional divination at all.
It is merely an interpretation of past and present trends, an analysis of 
interrelated data and an isolation of those data which reveal the inter- 
relations most clearly. Indexes whose absolute movements correlate 
synchronously with fluctuations in the rates of change of other data are 
especially useful for forecasting purposes.1 Mr. Karsten’s own article
gives an excellent illustration of the great value of the Harvard A Curve 
as a forecaster. He concludes that because of the interrelation of the 
A Curve and the B Curve, the A Curve begins to turn down (or up, as 
the case may be) at the point when the B Curve is crossing the normal 
line. From this he establishes his trend for the B Curve. He uses the 
A Curve to determine the exact point at which the B Curve crosses the 
normal line. Why? Because it is a simple matter to detect the high 
and low points on the A Curve, but it is not so simple to find the points 
at which B crosses the normal line. It is just for this reason that the A 
Curve is useful as a forecaster. It shows us where we are in the busi- 
ness curve, a thing which cannot be discovered by examining the busi- 
ness curve alone. 
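
The relation discussed above, in which the A Curve is treated as the inverted cumulative of the B Curve, can be illustrated with a minimal sketch. The sinusoidal B Curve, its 40-month period, and all variable names are assumptions made purely for illustration; they are not the Harvard series. The sketch only shows that the turning points of such a cumulative fall where the B Curve crosses its normal line.

```python
import numpy as np

# Hypothetical B Curve: deviations from the normal line over 120 months.
# The half-month offset keeps any value from falling exactly on zero.
months = np.arange(120)
b_curve = np.sin(2 * np.pi * (months + 0.5) / 40)

# A Curve taken as the cumulative of the B Curve, inverted (an assumption
# following the description of Mr. Karsten's construction).
a_curve = -np.cumsum(b_curve)

# Last month before each crossing of the normal line by B.
b_crossings = np.where(np.sign(b_curve[:-1]) != np.sign(b_curve[1:]))[0]

# Months at which A reaches a high or low point (its movement reverses).
a_steps = np.diff(a_curve)
a_turns = np.where(np.sign(a_steps[:-1]) != np.sign(a_steps[1:]))[0] + 1

print("B crosses normal after months:", b_crossings)
print("A turns at months:           ", a_turns)
```

In this constructed case the high and low points of the A series are easy to locate and mark the months at which B passes through normal, which is the sense in which the A Curve shows where we are in the business curve.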


1 In Mr. Karsten's analysis, however, the A Curve is the cumulative of the B Curve inverted.

THE GROWTH CURVE
By GEORGE R. DAVIES, University of North Dakota


It has often been assumed that if there is any statistical norm 
representing the course of biological growth, it is to be found in the 
area of the normal frequency curve. That is, the increments of growth 
are assumed to follow the law of chance as represented by the curve of 
dispersion, and the growth itself is therefore assumed to follow the 
integration of that curve, taken from its initial point rather than 
from the center as shown in the usual table. It has been assumed also 
that the growth of population during a dynamic pulsation, or during 
the maturing of a colony into a nation, follows this or some similar 
norm; and the well-known work of Professor Pearl on population 
growth has given a considerable degree of justification to the as- 
sumption. 

It is not the purpose of this paper to discuss the validity of such a 
norm as a means of forecasting future growth or the saturation point 
of numbers in any given country. It is assumed that a growth norm 
has a value as indicating the probable trend, and as suggestive of the 
changes accompanying an increasing pressure of population. The 
question to be raised is rather the mechanics of the growth curve. 

There are at least three available methods of fitting a growth trend. 
The first, which has been used by Professor Pearl, fits by the method 
of least squares the so-called logistic curve, 

$$y = \frac{k}{1 + e^{a + bx}}$$


in which 0 and k are the lower and upper asymptotes, respectively. 
The equation is an adaptation of Newton’s law of cooling. Applying 
this method to the population data of the United States, Professor 
Pearl obtains an upper asymptote of about 197,000,000, which he ac- 
cordingly regards as the saturation point of population under existing 
conditions of growth. 

A second method which is naturally suggested by the theoretical 
view of a growth pulsation is to fit the trend by a direct use of the 
normal curve of dispersion or its integration. A simple method of 
doing this is indicated in Table I. The calculation uses the census 
data adjusted to January first of each census year, the adjustment being 
made by geometric interpolation from 1790 to 1860, and by arithmetic 
interpolation in later years. The first differences of these data may
be considered as approximating a part of the normal frequency curve,
and their logarithms should therefore conform approximately to a 
quadratic parabola. Such a parabola is therefore fitted to the loga- 
rithms of the first differences, and the antilogarithms are taken (column 
5). These antilogarithms, adjusted to the total of the first differences, 
may be taken as the trend of the increments of growth. They may 
now be successively added to the first population item to obtain a 
trend (column 7) which when adjusted to the same total as the data 
may be taken as the growth trend of population (column 8). The 
trend may readily be projected to successive decades. The upper 
asymptote, or saturation point, is easily obtained by the method 
indicated at the foot of the table. It proves to be 251 million, a 
marked increase over the result obtained by Professor Pearl. 

TABLE I
NORMAL FREQUENCY AREA CURVE FITTED TO POPULATION OF UNITED STATES, 1790-1920

               (1)         (2)         (3)         (4)         (5)         (6)         (7)         (8)
Year    Population          Δ₁      log Δ₁    Parabola    Trend Δ₁    Trend Δ₁   Trend (1)   Trend (1)
        (millions)                         fitting (3) antilog (4)    adjusted       Σ (6)    adjusted

1790         3.860                                                                   3.860       3.834
1800         5.215       1.355      0.1319     0.10751       1.281      1.3146       5.175       5.149
1810         7.109       1.894      0.2774     0.26208       1.828      1.8616       7.036       7.010
1820         9.477       2.368      0.3744     0.40497       2.541      2.5746       9.611       9.585
1830        12.709       3.232      0.5095     0.53618       3.437      3.4706      13.081      13.055
1840        16.869       4.160      0.6191     0.65571       4.526      4.5596      17.641      17.615
1850        22.898       6.029      0.7802     0.76356       5.802      5.8356      23.477      23.451
1860        31.047       8.149      0.9111     0.85973       7.240      7.2736      30.750      30.724
1870        39.388       8.341      0.9212     0.94422       8.795      8.8286      39.579      39.553
1880        49.623      10.235      1.0101     1.01703      10.400     10.4336      50.012      49.986
1890        62.404      12.781      1.1066     1.07816      11.972     12.0056      62.018      61.992
1900        75.320      12.916      1.1111     1.12761      13.416     13.4496      75.468      75.442
1910        91.560      16.240      1.2106     1.16538      14.635     14.6686      90.136      90.110
1920       105.711      14.151      1.1508     1.19147      15.541     15.5746     105.711     105.685
Total      533.190     101.851     10.1140    10.11361     101.414    101.8508     533.555     533.191

Trend projected
1930                                           1.20588      16.065     16.0986     121.809     121.783
1940                                           1.20861      16.166     16.1996     138.009     137.983
1950                                           1.19966      15.837     15.8706     153.880     153.854
1960                                           1.17903      15.102     15.1356     169.015     168.989

Equation of parabola (4): $y = 0.85973 + 0.09033x - 0.00584x^2$
    Point of origin: decade 1850-1860, or 1855
Mid-date (mode) of parabola, by differentiating the equation:
    $dy/dx = 0.09033 - 2(0.00584)x = 0$
    $x = 7.736$, or 77.36 years
    Mid-date $= 1855 + 77.36 = 1932.36$
Standard deviation, by the equation $c\sigma^2 = -0.5 \log e$:
    $0.00584\sigma^2 = 0.21715$
    $\sigma = 6.098$, or 60.98 years
Upper asymptote: twice the final trend interpolated at 1932.4 = 251 million
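
The steps of Table I can be retraced with a short sketch. It assumes NumPy, takes the census figures from column (1), and fits the parabola by ordinary least squares with x counted in decades from the 1850-1860 increment; the variable names are illustrative only. Because it works from unrounded logarithms, its results may differ from the table in the last decimal places.

```python
import numpy as np

# Census populations (millions), 1790-1920, from Table I, column (1).
population = np.array([3.860, 5.215, 7.109, 9.477, 12.709, 16.869, 22.898,
                       31.047, 39.388, 49.623, 62.404, 75.320, 91.560, 105.711])

diffs = np.diff(population)             # column (2): first differences
log_diffs = np.log10(diffs)             # column (3)

# x counts decades from the middle increment (1850-1860): -6 ... 6.
x = np.arange(len(diffs)) - len(diffs) // 2

# Least-squares parabola log10(diff) ~ a + b*x + c*x^2, column (4).
c, b, a = np.polyfit(x, log_diffs, 2)

fitted = 10 ** (a + b * x + c * x**2)                  # column (5)
shift = (diffs.sum() - fitted.sum()) / len(diffs)
trend_diffs = fitted + shift                           # column (6)

trend = np.concatenate(([population[0]],
                        population[0] + np.cumsum(trend_diffs)))    # column (7)
trend += (population.sum() - trend.sum()) / len(population)         # column (8)

# Mode and standard deviation of the fitted normal curve, in decades.
mode_decades = -b / (2 * c)
sigma_decades = np.sqrt(-0.5 * np.log10(np.e) / c)
mid_date = 1855 + 10 * mode_decades

# Project four further decades and read the trend at the mid-date;
# the upper asymptote is taken as twice that value.
x_future = np.arange(x[-1] + 1, x[-1] + 5)
future = 10 ** (a + b * x_future + c * x_future**2) + shift
full_trend = np.concatenate((trend, trend[-1] + np.cumsum(future)))
years = np.arange(1790, 1961, 10)
asymptote = 2 * np.interp(mid_date, years, full_trend)

print(f"a={a:.5f}  b={b:.5f}  c={c:.5f}")
print(f"mid-date={mid_date:.1f}  sigma={10 * sigma_decades:.1f} years")
print(f"upper asymptote ~ {asymptote:.0f} million")
```

The linear interpolation between the 1930 and 1940 trend values is one reading of "interpolated at 1932.4"; a different interpolation rule would shift the asymptote slightly.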


It is easy to see that the method just described gives only an ap- 
proximation to the theoretical result sought. It would be easy to 
improve the method by the introduction of certain refinements, but 
the rectification simply raises the upper asymptote a little without
materially altering the fit of the trend. Since the contrast with
Professor Pearl’s result is sufficiently clear, the figures are allowed 
to stand without adding further complications. The weaknesses of the 
method may, however, be briefly noted. 

In the first place a method which involves the logarithmic first 
differences is not likely to prove satisfactory unless the data are highly 
regular. Apparently the sharp decline in the last increment, 1910 to 
1920, lowers the parabola somewhat, reducing the standard deviation 
and the date of the mid-point (zero σ). This inference may be supported
by plotting the logarithms of the observations, preferably
smoothed, and adjusting to them the logarithms of the area of the
normal curve at certain positions, as at −½σ, −1σ, and −1½σ. In fact
it is quite easy to fit the curve directly in this way, and with a fair 
degree of accuracy, using a simple mechanical drafting arrangement 
involving the loci of the three normal points. 

In the second place the method gives too much weight to the first 
and last items in the data by distributing the smoothed first differences 
between them (column 7). This over-emphasis could be avoided by 
adjusting the first differences between two points determined by a 
smoothing process. Nevertheless, in spite of the objections that may 
be urged against it, the method is a simple and easy one, sufficient 
for the purpose at hand. The close fit of the trend, and the normal 
frequency form of the first differences, are shown in Chart I. 

The marked increase in the upper asymptote obtained by the 
normal dispersion method as compared with Professor Pearl’s method 
is apparently due to inherent differences in the types of curves used 
rather than in the specific methods of fitting. If both curves are 
computed to approximately the same scale and plotted, it will be seen 
that the latter does not exhibit the extended flattened portion near the 
center such as appears in Chart I from about 1910 to 1960. Instead, 
it maintains a considerable degree of curvature to the center, where it 
reverses the curvature. In fitting this curve to the data at hand, 
it therefore reaches the center a decade or two earlier, giving a lower 
asymptote at the upper extreme. 

A third method of computing a growth curve is illustrated in Table 
II, where the Gompertz curve is fitted to the population data from 1810 
to 1920. The omission of the items belonging to the first two censuses 
is due to the necessity of totaling the observations in three groups 
covering equal periods of time.1 The equations used in computing
the parameters are adapted from R. B. Prescott’s work (this JOURNAL,
Vol. XVIII, pages 471-479), and are adjusted to a point of origin at
the first item. The symbols are sufficiently indicated in the table,
and the method of calculation is obvious. The results disclose a
remarkably high upper asymptote of over a billion population.2

1 When n is not divisible by three, the curve may be fitted as follows. Repeat each item three times
to fall at early, middle, and latter parts of each year, and separate these items into three equal groups
as before. The intersection between groups will isolate one from two repeated items ($m$). Call the
items in the years adjacent to the intersected year $m_1$ and $m_2$; adjacent to the one and two items,
respectively. In summating the three groups, add to the $m_1$ side and subtract from the $m_2$ side
$0.183m_1 - 0.1434m - 0.04m_2$. This modification in effect places the three repeated items on quadratic
parabolas connecting the adjacent mid-year items. Solve for the mid-year items, $x = 1$, $x = 4$, etc.
Applied to population data from 1790 to 1920, this method raises somewhat the saturation level.

2 This high asymptote is partly explained by the fact that the Gompertz curve is not symmetrical.

CHART I
POPULATION IN THE UNITED STATES, 1790-1920, FITTED WITH A TREND DERIVED
FROM THE FIRST DIFFERENCES TREATED LOGARITHMICALLY AS A
PART OF A NORMAL FREQUENCY CURVE

TABLE II
GOMPERTZ CURVE FITTED TO POPULATION OF UNITED STATES, 1810-1920

                (1)          (2)          (3)          (4)          (5)          (6)          (7)
Year     Population      log (1)            Σ            d         Bc^x     A + Bc^x    Trend (1)
         (millions)                                                                   antilog (6)

1810          7.109      0.85181                                -2.31721      0.82334        6.658
1820          9.477      0.97667                                -2.16638      0.97417        9.423
1830         12.709      1.10411                                -2.02537      1.11518       13.037
1840         16.869      1.22709 Σ₁ = 4.15968                   -1.89354      1.24701       17.661
1850         22.898      1.35980                                -1.77029      1.37026       23.456
1860         31.047      1.49202                                -1.65506      1.48549       30.584
1870         39.388      1.59536                                -1.54733      1.59322       39.194
1880         49.623      1.69568 Σ₂ = 6.14286 d₁ = 1.98318      -1.44661      1.69394       49.424
1890         62.404      1.79521                                -1.35245      1.78810       61.390
1900         75.320      1.87691                                -1.26442      1.87613       75.185
1910         91.560      1.96171                                -1.18212      1.95843       90.872
1920        105.711      2.02412 Σ₃ = 7.65795 d₂ = 1.51509      -1.10518      2.03537      108.485

Equation of trend: $y = ab^{c^x}$
    $\log y = \log a + c^x \log b$
    or, taking $\log a = A$ and $\log b = B$,
    $\log y = A + Bc^x$
Point of origin at first item, 1810

$\tfrac{1}{3} n \log c = \log d_2 - \log d_1$
    $4 \log c = 0.18044 - 0.29736$
    $\log c = -0.02923$
    $c = 0.93491$

$B(c^{n/3} - 1)^2 \div (c - 1) = d_1$
    $-0.85585B = 1.98318$
    $B = -2.31721$

$\tfrac{1}{3} nA = \Sigma_1 - d_1 \div (c^{n/3} - 1)$
    $4A = 4.15968 - 1.98318 \div (-0.23602)$
    $A = 3.14055$

Upper asymptote ($a$): $a = \text{antilog } A = 1382$ million

Thus by three separate calculations, each of them apparently valid,
we find that the probable saturation limit of population in the United
States, under present conditions of development, may be 197 million,
251 million, or 1,382 million. Evidently, if any one of the three
methods is theoretically correct, the others are not even approxima- 
tions. But which one, if any, represents the statistical norm of growth? 
Professor Pearl’s method has the right of way by virtue of its having 
been fitted successfully to a considerable body of data. But this 
proves little, for any one of the curves will fit fairly well any set of 
observations representing a partial or a relatively complete growth 
period. Theoretically the normal dispersion method appears to have 
the best of the argument, but this is only an unproved assumption. 
The third method has been used in business forecasting, where it is 
applied to the middle or upper portions of growth curves, in which 
cases it appears very satisfactory, particularly inasmuch as it may be 
fitted tentatively by a simple and rapid graphic method. As applied 
to population in the United States, however, it is entirely out of 
harmony with estimates of the saturation point calculated on the basis 
of available resources; and may therefore be ruled out. Such estimates, 
it may be said, suggest the second, or normal dispersion method, as 
being the closest to the real norm. 
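
The Gompertz asymptote quoted above comes from the three-group calculation of Table II, which can be sketched as follows. The sketch assumes NumPy and uses the relations printed under the table (sums of the logarithms by thirds, their differences, then c, B and A); the variable names are illustrative. Working from unrounded logarithms, it reproduces the tabled parameters only to about four figures.

```python
import numpy as np

# Census populations (millions), 1810-1920, from Table II, column (1).
population = np.array([7.109, 9.477, 12.709, 16.869, 22.898, 31.047,
                       39.388, 49.623, 62.404, 75.320, 91.560, 105.711])

logs = np.log10(population)             # column (2)
n = len(logs)
third = n // 3                          # items per group; n must divide by 3

# Sums of the logarithms over each third, and their differences.
s1, s2, s3 = (logs[i * third:(i + 1) * third].sum() for i in range(3))
d1, d2 = s2 - s1, s3 - s2

# (n/3) log c = log d2 - log d1
c = (d2 / d1) ** (1 / third)
# B (c^(n/3) - 1)^2 / (c - 1) = d1
B = d1 * (c - 1) / (c**third - 1) ** 2
# (n/3) A = S1 - d1 / (c^(n/3) - 1)
A = (s1 - d1 / (c**third - 1)) / third

x = np.arange(n)                        # origin at the first item, 1810
trend = 10 ** (A + B * c**x)            # column (7)
asymptote = 10 ** A                     # upper asymptote a, in millions

print(f"c={c:.5f}  B={B:.5f}  A={A:.5f}")
print(f"upper asymptote ~ {asymptote:.0f} million")
```

Small changes in the differences d1 and d2 move A, and hence the asymptote, appreciably, which is one reason the level found by this method is so far from the other two.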

Social conditions are most vitally related to the varying phases 
of the growth curve. It would be well if the statistical norm could 
be stated with a greater degree of certainty than now obtains.

A WEEKLY INDEX OF DISTRIBUTION
By D. C. ELLIOTT


Within the past few years, the science of measuring business move- 
ments has advanced to the point where it has become possible to con- 
struct indices of the true state of business activity, after eliminating 
the effects of seasonal factors and normal growth. As a result we 
now have several excellent curves of general business, such as those 
constructed by the Annalist and the American Telephone and Telegraph 
Company. These various curves, as is well known, are combinations 
of different business series of importance, such as pig-iron production; 
they agree quite closely with one another, and may be said to give a 
reasonably accurate picture of the course of business. 

One of the most important individual series is that of car loadings. 
This is considered by many authorities to be the best measure of general 
distribution, and by some it is even called the best single indicator of 
general business. The actual figures on car loadings are published 
weekly, and several monthly indices of car loadings, based on daily 
average cars loaded and adjusted for seasonal and long-time trend, 
have been worked out, among which may be mentioned those of the 
Harvard Committee on Economic Research and the Annalist. As far 
as the author is aware, however, no weekly adjusted index of distribution 
based on car loadings has as yet appeared, and such an index is pre- 
sented in the chart appearing herewith. 

This index was first projected with the idea of obtaining an up-to-date 
indicator of distribution, and that is all that it pretends to be in and of 
itself. At the same time, a study of the chart shows a very close 
correlation between the weekly car loadings index and the Annalist 
monthly index of general business activity, which is one of the standard 
business curves. It is realized, of course, that no single indicator 
can at all times portray the state of business as a whole; but at the same 
time, the close relationship between the two curves on the chart shows 
that the weekly car loadings index is ordinarily a valuable guide to 
changes in the trend of business, although in itself it is a measure of 
distribution rather than of general business activity. 

The outstanding advantage of the weekly car loadings index is that
it is almost up-to-date, whereas the various general business curves are
not. Car loadings figures for the week ending Saturday, May 28, for
example, were published in the papers of Thursday, June 9, and the
index is therefore only 12 days behind at all times. On the other hand,
the monthly business curves cannot, from their very nature, be up-to-
date; being composed of a number of individual series, they are not
complete until the information for every series has been received, and
in some cases such information is a month behind. For example, on
June 9th the car loadings index was available for the last week in
May (the latest point on the chart below), while the Annalist’s final
figure for April did not appear until June 10. The car loadings index
was therefore a month ahead of that of the Annalist. The latter’s
preliminary figure for May, subject to revision, was published on
June 17; by this time the car loadings index showed the first week in
June.

CHART
WEEKLY INDEX OF DISTRIBUTION, BASED ON CAR LOADINGS, COMPARED WITH THE ANNALIST
INDEX OF BUSINESS ACTIVITY, BOTH ADJUSTED FOR SEASONAL VARIATIONS AND SECULAR TREND
Legend: WEEKLY CAR LOADINGS INDEX; ANNALIST MONTHLY BUSINESS INDEX; STRIKES

In interpreting the car loadings index, it is obvious that a change in 
one week may not indicate anything in particular, and the fluctuations 
for the last three or four weeks should be studied before drawing con- 
clusions. Looked at in this way, the index frequently becomes useful 
in indicating a turn in business which is later confirmed by the Annalist’s 
monthly curve as soon as available. In 1921, for instance, there was 
a rather pronounced uptrend in car loadings early in the year over a 
period of eight or nine weeks, which was confirmed by a definite up- 
swing in the monthly curve following the low point of depression. In 
the spring of 1923 the car loadings index started on a decline which, 
with temporary interruptions, continued the rest of the year. The 
first few weeks of this occurred while the general business curve was 
still advancing, but the latter soon turned and also declined for the 
rest of the year. In the spring of 1924, after a recovery, a very marked 
slump in the distribution index became apparent, again confirmed by 
the Annalist index when that became available. The last few weeks 
in 1926 brought a brief but pronounced fall in car loadings, followed by 
an equally sharp recovery. This was reflected in the decline in the 
Annalist curve culminating in January, followed by the rapid rise in 
February. At the present time another slump in the car loadings 
index has been apparent for some weeks, pointing to a further recession 
in business during May; but the coal strike is a factor just now, and it 
remains to be seen whether the Annalist index for May will reflect 
the recent recession in the distribution index.1

1 Since writing the above, the Annalist’s final figure for May has appeared, being slightly higher
than that for April. The preliminary figure for June, however, declined to 101.6 from 103.7 in May.
The car loadings index fluctuated around 95 in June, but fell to 91.1 for the week ending July 28
after declining throughout July.

There are several disadvantages to the car loadings curve which it is
only fair to point out. In the first place, the rate of “normal growth”
has been affected by the unusually rapid increase in less-than-carload
loadings, and it is quite possible that the curve of growth during the
next ten years may be at a less rapid rate than during the past eight. 
This already appears to have had the effect of making the distribution 
curve a little too low on the scale during 1926 and 1927. It is intended
to recompute the secular trend at the end of 1927 to include that year, 
which will probably result in lifting 1926 and 1927 a little higher on the 
scale. 

Another drawback is the periodic recurrence of coal and railroad 
strikes. Every time one of these disturbers makes its appearance, the 
index is temporarily thrown out of balance and the observer must wait 
patiently until normal traffic conditions have been resumed. For- 
tunately such events have become less frequent in late years, and do 
not take place often enough to impair seriously the usefulness of the 
entire index. Another possible disadvantage is that the Annalist 
curve itself is rather heavily weighted with regard to car loadings, and 
the distribution index might not compare quite so closely with some 
of the other standard curves of business. Increasing transportation 
by automobile truck might also be mentioned, although this is a very 
minor factor as yet. 

The method of construction was as follows: The weeks in the year 
were first numbered arbitrarily from 1 to 52; all weeks ending from 
January 3 to January 9, inclusive, were called Week Number 1, and
so on. In 1919, Week Number 1 ended with January 4; in 1920, with
January 3; in 1926, with January 9. This means that throughout a
series of years, Week Number 1 will not include exactly the same days
of the month each year; they are close enough together for all practical 
purposes, however, and usually there is a difference of only one day 
between one year and the next. Once in five or six years there appears 
a year with 53 weeks; in this case, the last two weeks of the year were 
thrown together. The next step was to obtain the daily average car 
loadings for each week. This presented no difficulty except in the case 
of holiday weeks, where the Harvard method of weighting such weeks 
was used (See Review of Economic Statistics, October, 1926, p. 175). 
This method took care of the great majority of holiday weeks from 
1919 to 1927 very satisfactorily, and in the few cases where the daily 
average of holiday weeks was still obviously out of line, the average 
of the preceding and following week was used instead. The third step 
was to remove the seasonal variation. As the seasonal movement of 
car loadings is very regular from year to year, the simple method of 
arithmetic averages was used, i.e., the eight-year average for each week.
However, on account of the extreme fluctuations in 1920 and 1922 due 
to strikes, the weeks affected in these years were eliminated in making
up the averages. Finally, the long-time trend was removed by the
method of least squares. 
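
The construction just described can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `weekly_distribution_index`, the use of ratios to remove the seasonal movement, the straight-line form of the trend, and the sample figures are all hypothetical, since the text specifies only weekly averages across years and a least-squares trend. The holiday-week weighting is taken as already applied to the input.

```python
import numpy as np

def weekly_distribution_index(daily_avg, strike_mask=None):
    """Adjust daily-average car loadings for seasonal variation and trend.

    daily_avg   : array of shape (years, 52), one value per numbered week.
    strike_mask : optional boolean array of the same shape marking
                  strike-affected weeks to omit from the seasonal averages.
    Returns the index (normal = 100) with the same shape.
    """
    loadings = np.asarray(daily_avg, dtype=float)
    if strike_mask is None:
        strike_mask = np.zeros(loadings.shape, dtype=bool)
    strike_mask = np.asarray(strike_mask, dtype=bool)

    # Seasonal factor for each numbered week: its average across the years,
    # leaving out strike weeks, relative to the average of all weeks.
    usable = np.where(strike_mask, np.nan, loadings)
    week_means = np.nanmean(usable, axis=0)
    seasonal = week_means / week_means.mean()

    # Seasonally adjusted figures, strung out in time order.
    adjusted = (loadings / seasonal).ravel()

    # Long-time trend: a straight line fitted by least squares (assumed form).
    t = np.arange(adjusted.size)
    slope, intercept = np.polyfit(t, adjusted, 1)
    trend = intercept + slope * t

    return (100.0 * adjusted / trend).reshape(loadings.shape)


# Hypothetical sample: 8 years of 52 weeks, fixed seasonal swing, no growth.
weeks = np.arange(52)
pattern = 1 + 0.1 * np.sin(2 * np.pi * weeks / 52)
sample = np.tile(150_000 * pattern, (8, 1))
index = weekly_distribution_index(sample)
print(index.round(1)[0, :5])
```

With purely seasonal sample data the index stays at normal throughout; a week depressed by a strike and flagged in `strike_mask` shows up as a low index value without distorting the seasonal factors.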

To sum up, it was felt that if some fairly dependable weekly indicator 
of business were secured, it would permit a more up-to-date analysis 
of business changes. Such an index is found in the weekly adjusted 
curve of car loadings. Taken alone, this can only be called an index 
of distribution; but taken in conjunction with one of the standard 
curves of business activity, it is a valuable indicator of what has been 
going on since the publication of the latest available monthly index 
figure, and affords a guide to present business conditions which may be 
checked up when the monthly index figure appears. There are various 
difficulties with the car loadings index, and it is not claimed to be 
infallible; at the same time, it moves closely with the general business 
curve, with minor interruptions, and its disadvantages are outweighed 
by the fact that it is very nearly up-to-date.

PROGRESS OF WORK IN THE CENSUS BUREAU
THE CENSUS OF RELIGIOUS BODIES: 1926 


The Bureau of the Census has just issued a small bulletin (8 pp.) on “The
Catholic Apostolic Church: Statistics, Denominational History, Doctrine, and 
Organization.” This represents the first fruits of the census of religious bodies, 
which has been in progress for more than a year. Naturally, in the census 
process, the smaller denominations are among the first to come to the top, the 
denomination above mentioned having only 8 church edifices and a total church 
membership of 3,408. It may help to identify it, if we add that it originated in 
England in the first half of the 19th century under the influence of the celebrated 
preacher, Edward Irving, whose followers were known as “Irvingites.”

In all, there are about 200 denominations covered by the census. The 
bulletins, when the series is completed, will be brought together and bound in a 
single octavo volume which, under the guise of a census report, will be virtually 
an encyclopaedia of religious denominations in the United States. 

The census of religious bodies, as its name indicates, is a census of religious 
organizations, rather than a census of the population classified according to 
religious affiliation such as is taken in Canada and a number of other countries. 
The items of information called for on the schedule include number of church 
edifices; membership; value of church property; debt on church property; 
expenditures (distinguishing current expenses from benevolences); Sunday 
schools (number and number of scholars). 

The census of religious bodies is based mainly upon returns secured directly 
from the individual churches or corresponding local units of all denominations. 
As a first step it was necessary to secure up-to-date lists of the local churches in 
more than 200 denominations, the whole number of such churches being about
225,000. For some denominations fairly satisfactory lists were obtained from 
the yearbooks, or were furnished in other forms by the general denominational 
organization. In other cases extensive correspondence with the secretaries of 
local associations or conferences was required; and in still other cases lists were 
secured by men sent out from the office for that purpose, or through local men 
appointed as special agents of the Bureau. 

Schedules were mailed to the churches on these lists, with a regular series of 
follow-up letters for those who failed to respond. Telegrams were used to secure
reports from the last few churches in many denominations; and other methods, 
including personal work in the field, were employed to complete the work,
denomination by denomination. About 210,000, or approximately 93 per cent of
the expected total number of reports, have been received. 

The tabulation of the completed denominations has been started, and pre- 
liminary press statements are being issued by denominations. Each summary, 
before being issued, is submitted to the denominational headquarters for ap- 
proval—a process which delays somewhat the publication of the figures, but 
which forestalls criticism or the expression of dissatisfaction with the figures 
after they have been made public. 

A CENSUS OF DISTRIBUTION 


The Bureau of the Census has been taking a limited census of distribution, or 
of wholesale and retail trade. There is a great lack of statistical data on that 
subject; and there is a general recognition of the need and importance of obtaining 
such data to supplement the statistics of production obtained through the census 
of manufactures and of agriculture and from other sources. The only question 
has been how far it is practicable to cover the subject of distribution by a census; 
and the work which the Bureau has been doing in that field should be regarded as 
largely experimental, undertaken with a view to ascertaining what is practicable 
and worth while. It is too soon yet for any definite conclusions on that question. 
The compilation of data must first be completed and the results carefully studied. 

The census was first tried out in the city of Baltimore; and as the results seemed 
to justify it, was then extended to a number of other cities, including Atlanta; 
Chicago; Denver; Fargo; Kansas City (Missouri and Kansas); Providence, 
Central Falls, and Pawtucket, Rhode Island; San Francisco, Alameda, Berkeley, 
and Oakland, California; Seattle; Springfield, Illinois; and Syracuse. 

The work was taken up with the support and cooperation of the United States 
Chamber of Commerce and of the various local chambers of the cities covered 
by the inquiry. 

The schedule or questionnaire calls for the following items: Number of stores; 
kind of business (such as grocer, hardware, general store, etc.); number of 
proprietors and firm members (in case the business is not incorporated); number of 
employees, distinguishing between selling and non-selling, and including under 
the latter salaried officers, general managers, and assistants, cashiers, book- 
keepers, delivery men, etc.; total amount paid in salaries and wages; merchandise 
inventory on December 31, and average for the year; net sales (gross less goods 
returned), by kind of commodity. 


FACTORS INFLUENCING THE MOVEMENTS OF 
SECURITY PRICES 


A dinner meeting of the American Statistical Association was held at the 
Aldine Club, 200 Fifth Avenue, New York City, on the evening of May 5, 1927. 
About 170 were present. The meeting was ably presided over by Dr. Frederick 
R. Macaulay of the National Bureau of Economic Research. 

The first speaker of the evening was Mr. Alexander D. Noyes, Financial 
Editor of The New York Times. He pointed out that the stock market commonly 
indicates future conditions in finance and trade because purchases or sales, as 
the case may be, are effected or inspired by individuals in a position to know 
the actual trend of things. Such advance knowledge may have reference to a 
diplomatic breach, to a good or bad turn in the harvests, or to increased or re- 
duced earnings of certain great corporations. Investors who are aware of the 
trends in such directions, or investors and speculators who have the faculty of 
discerning such trends, may shape their operations in the market accordingly. 

On the surface, many past movements of the market have been in this regard 
inconsistent, yet there was usually a reason for the inconsistency, based on 
considerations in the general situation. The outbreak of the Great War in 1914 
caused a violent decline in New York stocks—yet continuance of the war in 
1915 caused a great advance. This discrepancy, however, indicated merely a 
change from the attitude of 1914, which was based upon the fear of a blockade 
of American exports, immense withdrawal of capital from the United States, 
and a breakdown of the American financial structure, to the attitude of 1915, 
which was visibly based upon resumed freedom of export trade, maintenance of 
the gold standard, introduction of the Federal Reserve system, and the resultant 
flow of capital into the United States instead of in the outward direction. 

Similarly, the decline in stocks at the beginning of 1919 was followed, without 
actual change in the general situation, by the great advance at the end of 1919 
and in the first half of 1920. Both movements were, in their way, prophetic, 
but the early decline foreshadowed the deflation movement which came in 1921 
and ignored the intervening movement of credit inflation, whereas the subsequent 
advance of stocks correctly foreshadowed the inflation movement but ignored 
the factors making for speedy deflation. 

Referring to the progressive advance during two years past in prices on the 
Stock Exchange, Mr. Noyes pointed out that, in older times, this would have 
foreshadowed a rise of prices for commodities, and speculative activity in trade, 
as it did in 1901 and 1916, but that, during this recent movement, general trade, 
while expanding, has been conducted with the utmost conservatism and has 
been carried on during a time of gradual but continuous fall in prices. Also, in 
other periods, continued prosperity was deemed to be dependent on prosperous 
agriculture, whereas agricultural depression has, in the present period, become 
even a political consideration. The explanation of this discrepancy is that stocks 
have risen on the basis of exceptionally abundant credit facilities, low interest 
rates, and confidence in the continuance of both, whereas trade has been pro- 
ceeding under different conditions under which orders are placed only on the 
basis of nearby visible requirements instead of future probabilities. Yet 
aggregate consumption, even on this basis, has been of unprecedented magnitude 
and, up to this time, despite the fall of commodity prices, profits of industrial 
corporations have been exceptionally large. 

Whether, under the impetus of unprecedented pressure of capital on the in- 
vestment markets, an advance of such a scope in stocks can be said to have fore- 
shadowed accurately the trade movement, and whether, if it does not, there may 
not exist certain dangers in the stock market movement itself, is a practical 
question. So far, however, as the visible signs indicate, capital and credit are 
becoming more rather than less abundant, and the movement of American 
trade prosperity is not yet visibly checked. 


The second speaker of the evening was Dr. Lewis H. Haney, Director of the 
Bureau of Business Research of New York University. His remarks dealt 
primarily with methods of forecasting the prices of common stocks. In his 
opinion, it is almost useless to attempt to forecast the action of the market as a 
whole, for, at present, the market is made up of a large number of different groups 
of stocks each more or less homogeneous in itself. The price-averages for these 
groups do not vibrate in unison. It is necessary, therefore, to study each group 
separately. 

One peculiarity of the stock market is that there is no distinct quantity de- 
manded against a quantity supplied. Buyers and sellers both come from the 
same class; that is, a given man may be a buyer on one day and a seller on the 
next, or even at the same time. We are forced to deduce indirectly the position 
of demand and supply. 

Stocks are valued not only on account of dividends actually paid but also 
because of additional earnings carried to surplus. The best way, then, to fore- 
cast the movements of stock prices is to ascertain what is happening to earnings. 
The volume of earnings depends upon the differential between cost and the value 
of goods sold. It is often possible, then, by comparing the prices of materials, 
labor, etc., purchased, with the prices of commodities sold, to tell in advance what 
will happen to the earnings of the company. Unfortunately, one is not always 
able to take earnings statements at their face value, for many corporations manip- 
ulate these statements by arbitrarily writing off amounts for various reserves. 
Other companies deceive the public by failing to report the earnings of sub- 
sidiaries. One must also take into consideration the quality of the management 
of a company. 

In two different ways, interest rates play an important part in determining 
the value of all securities. In considering investment securities one should note 
primarily the interest rates on long term loans. In the case of speculative con- 
ditions, however, call and time loan money rates are of greatest importance, for 
they show how much money is available for marginal trading. It is generally 
recognized that low interest rates make for high stock prices, but it is not so 
commonly understood that the fact that interest rates have continued low for a 
considerable time constitutes an additional force acting in the same direction. 
It is much easier to forecast the course of interest rates than to forecast the course 
of stock prices; hence the study of interest rates can be used to tell something 
about what is going to happen to stock prices. It is worth while to study the 
ratio of short-time interest rates to the total value of stocks traded in. 

A factor which should be taken into consideration in the study of the securities 
of any company is the amount of capital and readily marketable assets on hand. 
The company well supplied with liquid funds is not likely soon to discontinue 
paying dividends, and this fact tends to place its stock in a strong position. 

Other forces which ought to be taken into consideration in forecasting move- 
ments of stock prices are the extent of retail trade, the volume of unfilled orders, 
and the total value of goods sold. The last factor, combined with an index of 
interest rates, is one of the best criteria of the future course of stock prices. 
When stock prices rise, but the volume of trade in stocks remains stationary 
or declines, a fall in the prices usually occurs within a short time. 
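
The last observation amounts to a simple divergence rule, which may be sketched as follows. This is only an illustration of the rule as reported: the function name `divergence_warning`, the `tolerance` parameter, and the sample figures are hypothetical and do not come from Dr. Haney's address.

```python
from typing import Sequence


def divergence_warning(prices: Sequence[float],
                       volumes: Sequence[float],
                       tolerance: float = 0.0) -> bool:
    """Flag a window in which prices rose while trading volume was flat or fell.

    `tolerance` is the largest fractional rise in volume still treated as
    "stationary" (an assumption made for this sketch).
    """
    if len(prices) < 2 or len(prices) != len(volumes):
        raise ValueError("need two or more observations of equal length")
    if volumes[0] <= 0:
        raise ValueError("first volume must be positive")
    price_rose = prices[-1] > prices[0]
    volume_change = (volumes[-1] - volumes[0]) / volumes[0]
    return price_rose and volume_change <= tolerance


# Hypothetical figures: prices rise while volume drifts lower.
print(divergence_warning([100, 104, 109], [2.0, 2.0, 1.8]))  # True
# Hypothetical figures: prices and volume rise together.
print(divergence_warning([100, 104, 109], [2.0, 2.4, 2.9]))  # False
```

The rule says only that a fall "usually" follows, so the flag is at most a warning and not a forecast.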

In considering current conditions in the stock market, Dr. Haney pointed out 
the existence of a very large supply of capital equipment. Such a supply means 
poor returns for marginal equipment and puts marginal concerns in a weak 
position. A counteracting force is the fact that the management of the stronger 
concerns has become more conservative and profits have apparently been 
stabilized to such an extent that many stocks formerly regarded as speculative 
have now been brought into the investment class. As a result, the ratio of the 
prices of stocks to dividend yields has been growing steadily during the last 
few years. 


The third speaker of the evening was Mr. Barnabas Bryan, Petroleum Econo- 
mist. He dealt with the situation in the oil industry. He believed this industry 
to be suffering from the effects of too much efficiency in production. Several 
years ago, it was felt that a shortage of gasoline would soon appear, but the out- 
put of this commodity has been doubled by the cracking process, resulting in a 
great over-supply. The effect of the present high production is to eliminate all 
profits in the case of many of the smaller companies. Highly integrated con- 
cerns are still, however, in a safe position. 

Over-production is the necessary outcome of the existing method of handling 
new oil fields. Whenever a well is drilled into an oil pool, it tends to drain the 
oil from the surrounding territory and, hence, forces all other owners of any part 
of the pool to drill at once and extract oil as rapidly as possible, otherwise they 
will soon find their holdings valueless. 

The thing most needed in the oil industry is a combination of all the producers 
in a given locality in order to prevent the present wasteful methods of producing 
oil—methods which must inevitably squander the Nation’s resources and also 
bring financial disaster to the concerns engaged in the production. The oil 
companies are extremely fearful of violating the provisions of the Sherman 
Anti-Trust Law for they still have in mind the action taken in regard to the 
Standard Oil Company a number of years ago. There seems to be little hope of 
checking the great over-supply of oil until the oil companies stop 
misunderstanding the Anti-Trust Law, a law which does not actually forbid cooperation 
to prevent over-production. 


Mr. Jacques Cohen, a member of the New York Stock Exchange firm of Baar, 
Cohen and Company, was the fourth speaker. His talk dealt not with the 
conditions determining the trend of the stock market but rather with those 
responsible for the minor fluctuations. In his opinion these minor fluctuations are 
mainly due to three general causes: (1) Technical conditions, (2) general psy- 
chology, and (3) manipulation. 

To illustrate what is meant by “technical conditions,” Mr. Cohen pointed out 
that, whenever an advance in the price of a stock is more rapid than is warranted 
by the improvement in the condition of the company at the time, the technical 
condition of the stock becomes weak. Under such circumstances, one can look 
for a sharp reaction at any time. 

Numerous factors influence the psychology of those engaged in stock specula- 
tion. World events of every possible variety may play a part. Frequently 
advice given in market letters or in the financial columns of the newspapers has 
great influence. If, for example, it is observed that the writer of some market 
letter has been particularly successful in forecasting the advance of particular 
stocks, there is a strong tendency for people to buy those stocks whenever he 
predicts that they will advance and this brings about the advance which he has 
predicted. This situation may continue for some time until he makes several 
bad mistakes, when some other leader is chosen. 

Frequently some leading speculator, for some trivial reason, may decide to buy 
a considerable block of some given stock. His purchase leads not only to other 
purchases of the same stock, but also to purchases of stock in similar companies. 
Soon the entire group of stocks advances. Then, all kinds of fantastic reasons 
are assigned for the advance in this group of stocks. 

There are certain psychological laws which pertain to nearly all speculators. 
One of these is reluctance to take a loss. Very few speculators will sell a stock 
on a decline, no matter how bad the condition of the company looks, unless they 
can get as much as they paid for the stocks. They refuse to recognize the fact 
that the past has nothing to do with prospective profits. 

As a rule, a marked improvement in the condition of a company is followed 
shortly by a rise in the value of its stock. This condition is brought about by 
the fact that those on the inside are aware of the improving condition of the com- 
pany and begin to purchase its stock. Almost always they pass along the good 
news to their friends and these individuals also buy, thus forcing the price of the 
stock up. 

From time immemorial, various reformers have campaigned against short 
selling, their theory being that such action tends to depress the market and 
diminish values. This belief has no foundation in fact, for, of course, all persons 
who sell short must later buy again. What short selling really does is to lessen 
the violence of the fluctuations in the market. 

The meeting adjourned. 

American Statistical Association 


MISCELLANEOUS NOTES 


The Forthcoming Annual Meeting.—President Edmund E. Day announces that 
the next annual meeting of the American Statistical Association is to be held from 
Tuesday morning, December 27, to Thursday afternoon, December 29, at Wash- 
ington, D.C. Details of the program are not yet available. 


United States Bureau of Labor Statistics.—Since the last quarterly report the 
Bureau of Labor Statistics has started the following wage investigations: Wages and 
hours of labor in foundries and machine shops; wages and hours of labor in the 
manufacture of aluminum, brass, and copper wares; and wages and hours of labor in 
the manufacture of electrical appliances. A study of productivity in the woolen 
industry in European countries is being made by Charles E. Baldwin, Assistant 
Commissioner, who is now in Europe. 


Cleveland Chapter Meeting.—The annual dinner meeting of the Cleveland Chapter 
of the American Statistical Association was held on Wednesday evening, April 13, 
at the High Noon Club. There were sixty members and guests present at dinner, 
and they accorded the program which followed a very cordial reception. 


Mr. Harry A. Wembridge, of the Joseph and Feiss Company, presided at the 
meeting, and the speakers and their subjects were as follows: 


Dr. G. E. Harmon, of Western Reserve University, “Personal Impressions of 
Karl Pearson and his Laboratory.” 

Mr. Bradford B. Smith, of the White Motor Company, “Correlation Concepts.” 

Mr. H. W. Green, of the Cleveland Health Council, “Population Analysis by 
Census Tracts.” 

Colonel Leonard P. Ayres, of the Cleveland Trust Company, “Prosperity.” 


Dr. Harmon, who is professor of hygiene and bacteriology at Western Reserve 
University, has just completed a year’s work under Professor Karl Pearson, and he 
gave his impressions of Karl Pearson as well as his impressions of the latter’s labora- 
tory and how his research is conducted. Dr. Harmon told also of some of the 
problems with which Professor Pearson is at present concerned. 

Mr. Smith followed with a fascinating discussion of “Correlation Concepts” in 
which he showed, by making use of some commonplace yet unique illustrations, 
how correlation relationships could be pictured graphically. 

The third speaker, Mr. Green, presented the results of his excellent study of Cleve- 
land’s population in 1910 and 1920, in which he made use of Bureau of Census data 
arranged by census tracts. The study included not only the usual facts about color, 
parentage and citizenship but introduced as well such information as illiteracy, 
home-ownership and church affiliation. 

The program closed with a talk by Colonel Ayres, who explained that the old 
concept of the business cycle was undergoing a radical change and that the most 
radical change was occurring in the conception of the prosperity phase of this cycle. 
Heretofore we have always experienced a rising price level in a period of prosperity, 
but the falling price level of today has necessitated drastic reductions in the costs of 
operation of business enterprises. Competition has become more intense and great 
mergers and consolidations have been effected. The result has been that business 
activity has gone to greater heights than ever before experienced in this country. 
Colonel Ayres urged, therefore, that all statisticians think clearly on the idea of this 
present-day prosperity, and he urged further that they be extremely careful when 
they came to defining and measuring this same prosperity. 


Meeting of the San Francisco Chapter.—The third dinner meeting of the San 
Francisco Chapter of the American Statistical Association was held at the Com- 
mercial Club, San Francisco, on Thursday, April 7, 1927. 

The constitution and by-laws were adopted after a discussion of the report of the 
organization committee. 

The topic for the evening was “Seasonable Fluctuations in Industry and Business.” 

The first speaker, Louis Bloch, Statistician of the California Bureau of Labor 
Statistics, spoke of the seasonal fluctuations in employment of labor in the various 
industries and cited examples from surveys made for the State Bureau. He stressed 
the fact that the seasonal peak could be eliminated from most operations through 
proper scheduling and planning of production throughout the year, and showed the 
great improvement which had taken place in the motion picture industry through 
planning and coordination. The partial employment situation in the coal industry 
throughout the Nation was cited as one which could be controlled with benefit to 
producer and consumer. The problem of the California migratory laborer whose 
movements followed the ripening of the crops was discussed. Its hardships and 
dangers to the migrant, the community, and the employer were pointed out; as for 
example, when due to weather conditions crops in certain areas ripened before their 
usual sequence, it was impossible to secure migratory labor to harvest them. 

Joseph S. Davis, Director of the Food Research Institute, Stanford University, 
was the second speaker. He discussed aspects of certain technical methods for 
measuring seasonal variation, with examples from the baking industry. He criti- 
cized current definitions of the seasonal factor and stressed the distinction between 
seasonal fluctuations and recurrent fluctuations. He demonstrated the different 
results obtained by applying several of the standard methods for measuring seasonal 
variation to the same series of data. These methods were then applied to the entire series and 
to normal and abnormal periods of the series. After a consideration of analysis of 
past data, the discussion centered on the best method to use in forecasting. 
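
The point that different methods give different seasonal measures for the same series can be illustrated with a small sketch. The two methods below (a simple-averages index, and the median of each period's ratio to its own year's mean) are chosen only for illustration and are not necessarily those used by Mr. Davis; the function names and the quarterly figures are hypothetical.

```python
from statistics import mean, median
from typing import List


def simple_average_index(years: List[List[float]]) -> List[float]:
    """Average each period across years, as a percentage of the grand mean."""
    grand = mean(v for year in years for v in year)
    periods = len(years[0])
    return [100.0 * mean(year[p] for year in years) / grand
            for p in range(periods)]


def median_ratio_index(years: List[List[float]]) -> List[float]:
    """Express each value as a percentage of its own year's mean,
    then take the median of those percentages for each period."""
    periods = len(years[0])
    ratios = [[100.0 * v / mean(year) for v in year] for year in years]
    return [median(r[p] for r in ratios) for p in range(periods)]


# Hypothetical quarterly output for three years, with a rising trend.
series = [
    [10, 14, 12, 8],
    [12, 18, 15, 11],
    [15, 22, 18, 13],
]

a = simple_average_index(series)
b = median_ratio_index(series)
for quarter, (x, y) in enumerate(zip(a, b), start=1):
    print(f"Q{quarter}: simple averages {x:6.2f}   median of ratios {y:6.2f}")
```

Because the series has a rising trend, the simple-averages index gives more weight to the later, larger years, while the ratio method removes each year's level first; the two sets of indices therefore differ slightly in the first and fourth quarters.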

Before adjournment a nominations committee was appointed, to report at the 
next meeting. 


Summary of Activities of the Denver Branch.—During the past year the Denver 
branch of the American Statistical Association has held fourteen meetings with an 
attendance ranging from ten to forty-six. The discussions dealt with such subjects 
as “The Use of Statistics in Management of the Great Western Sugar Company,” 
“The Use of Charting in Industrial Engineering,” “The Development of the 
Cooperative Movement,” “Locating Earthquakes through the Seismograph,” “The 
Development of Colorado’s Resources,” “Retail Cost Studies,” “The Use of 
Statistics in Marketing,” “Philosophy and the Scientific Method,” and “Some Technical 
Problems in Industrial Research.” 


Spring Meetings of the Ohio Conference of Statisticians.—The spring meetings 
of the Ohio Conference of Statisticians were held at the Ohio State University in 
Columbus on April 16. The programs were developed jointly by the Business 
Statistics and Educational Statistics sections. 

The morning meeting was a joint session of the two sections. Herbert A. Toops 
of the Department of Psychology, Ohio State University, read a paper on “Methods 
of Checking the Accuracy of Data.” H. S. Will, Vice-President of the Conference, 
read a paper on “The Predictive Value of Approximation Equations.” Ralph J. 
Watkins, Bureau of Business Research, Ohio State University, read a paper on “A 
Non-Technical Approach to Correlation in Measuring Economic Relationships.” 

The luncheon meeting was held at the Faculty Club, and Dr. Harry Myers, 
Director of Industrial Relations, Delco-Light, Frigidaire Unit, gave a spirited 
address on “The Human Factor in the Execution of a Budgetary Program.” 

For the afternoon session the two sections of the Conference held separate programs. 
The program of the Business Statistics section was devoted to questions of production 
control. W. F. Bloor, Goodyear Tire & Rubber Company, read a paper on “The 
Coordination of Sales, Production, and Inventories in the Preparation of a Production 
Program.” Charles H. Chase, Department of Extension, Ohio State University, 
gave a discussion on the subject of “Administrative Relationships Involved in the 
Preparation of a Sales and Production Program.” “Budgeting of Labor in the 
Construction and Operation of a Production Program” was discussed by Willis 
Wissler, Bureau of Business Research, Ohio State University. 

The Educational Statistics section was devoted to statistical problems in educa- 
tion. B. R. Buckingham, Director of the Bureau of Educational Research, Ohio 
State University, led the program with a paper on “Statistical Thinking.” 
“Educational Statistics as Seen from the Editor’s Chair” was the subject discussed by 
E. S. Ashbaugh, Assistant Director of the Bureau of Educational Research, Ohio 
State University. H. A. Edgerton, Department of Psychology, Ohio State 
University, read a paper on “What Tests Best Predict Success in the Various Colleges of 
Ohio State University?” 

The dinner meeting was in charge of the Business Statistics section and consisted 
of a symposium on the current trend of business. Louis H. Bean, Bureau of Agri- 
cultural Economics, Washington, D. C., discussed the outlook for agriculture. 
J. W. Hill of the Iron Trade Review, Cleveland, outlined the present trend in the 
iron and steel industry. The outlook for the automobile industry was discussed by 
John W. Scoville of the Chrysler Motor Corporation. E. O. Merchant of the J. H. 
Meade Company, Dayton, and retiring President of the Conference, surveyed the 
situation in the paper industry. The outlook for the stock market was presented by 
Ray Vance of the Brookmire Economic Service. 

It is expected that some of the papers will be printed in the Proceedings to be 
published by the Bureau of Business Research, Ohio State University. 


The Advisory Committee to the Director of the Census.—The Advisory Committee 
from the American Statistical and American Economic Associations to the Director 
of the Census met with the Director at his request for a two-day conference on June 
17 and 18. Those present were Mr. Rossiter and Mr. Andrew of the Statistical 
Association, and Professor Willcox, Professor Young and Professor Warren of the 
American Economic Association, the remaining member from the Statistical Asso- 
ciation, Professor Chaddock, being abroad at this time. 

The meeting discussed with the Director the various current problems of the 
Census, in accordance with the general policy of committee meetings, acting some- 
what in the capacity of a board of expert advisers or, in the industrial world, of a 
board of directors. Not only were current subjects discussed, but also the plans 
which the Director is maturing for the mechanical and other details in connection 
with the Fifteenth Census. 

Professor Allyn A. Young, who has been a member of the Advisory Committee 
during practically all of its existence since 1918, attended this meeting for the last 
time, as he is preparing to leave for England on August 13 for a three-year period of 
lecturing at an English university. The departure of Professor Young means a 
serious loss to the Advisory Committee, as his knowledge of Census problems and his 
extremely ripe judgment have made him an important factor in the deliberations and 
conclusions of the Committee. 


The Encyclopedia of the Social Sciences.—Professor Wesley C. Mitchell has been 
appointed to represent the American Statistical Association on the Board of Directors 
of the Encyclopedia of the Social Sciences. 


A Committee to Study Linguistic Stocks.—The American Council of Learned 
Societies has appointed a committee to study the national or linguistic stocks in the 
white population of the United States with especial reference to their number at the 
beginning of our national history. Different students have reached divergent results 
on this problem, and it is hoped that a careful study may bring these conclusions into 
closer agreement. The members of the committee are: R. H. Fife, M. L. Hansen, 
J. A. Hill, J. F. Jameson and W. F. Willcox, chairman. Those familiar with the 
previous discussion of the question will note that the committee fairly represents the 
divergent views hitherto expressed on the problem. A grant of $10,000 has been 
obtained for the work of the committee which is likely to begin in the near future. 


The Third International Congress of Scientific Management.—The Congress was 
held in Rome, September 5-10, and was followed by a visit to the most important 
industrial towns of Italy. His Excellency, Prime Minister Mussolini, was President 
of the Honorary Committee. Senator Luigi Luiggi was President of the Congress, 
which was organized by the special Department of Scientific Management, created 
by the General Federation of Italian Industry and other associations. 

The Congress undertook discussion of many important problems related to the 
efficiency of industry, agriculture, and the public services, and included the standard- 
ization of several types of wares and other industrial products, the concentration of 
industry, the scientific investigation of the skill of workers and their vocational 
efficiency, etc. 


The Editor wishes to call attention to the private publication of two of the papers 
delivered before the Eighty-eighth Annual Meeting of the American Statistical 
Association at St. Louis, which, owing to the limitations of JOURNAL space, could not 
receive prompt publication in our pages. They are: “Building Contracts and 
Business Movements,” by Thomas S. Holden, Vice-President in Charge of Statistical 
Division, F. W. Dodge Corporation, 119 West 40 Street, New York City; and “The 
Problem of Analyzing Local Business Conditions,” by John R. Riggleman, University 
of California at Los Angeles and Eberle and Riggleman, Inc., 810 South Spring Street, 
Los Angeles, California. Members desiring copies should communicate with the 
authors. 


A Barometer of Industrial Stock Prices is being published periodically by Sil- 
berling and Schaffer, 2163 Center Street, Berkeley, California, a description of it 
appearing in the February 5 issue of their Business Report. 


PERSONAL NOTES 


Dr. William A. Berridge has resigned as Director of the Bureau of Business Re- 
search and Associate Professor of Economics at Brown University, to become Econ- 
omist at the Metropolitan Life Insurance Company, 1 Madison Avenue, New York. 

Announcement has been made that Professor John H. Cover, of the University of 
Denver and Director of the Bureau of Statistical Research of that Institution, has 
resigned to accept the professorship of statistics at the University of Pittsburgh and 
the Directorship of the Bureau of Business Research there. 


Miss Frances Brooks has resigned from the Department of Statistics in the Russell 
Sage Foundation to join the staff of the Karsten Statistical Laboratory, New Haven, 
Connecticut. Her place in the Russell Sage Foundation has been taken by Miss 
Margaret H. Hogg, recently of the Department of Economics at Smith College, 
formerly Research Assistant in Statistics in the London School of Economics and 
Political Science. 


Professor M. Palyi, who has been connected with the University of Chicago, has 
returned to his residential college, the Berlin Handelshochschule. 


OBITUARY NOTES 


Mr. Rudolph Diamant, formerly economist and statistician of the Prudential 
Insurance Company, died in April. 

Mr. William S. Haimes of Washington, D. C., died in May. 

Mr. Arthur C. Keller, born in St. Gall, Switzerland, in 1881, died in Milwaukee, 
Wisconsin, in May. He received his training in electrical engineering in the leading 
technical schools of Switzerland and ranked high in that profession in this country. 

Mr. William Henry Porter, prominent in banking circles in New York City since 
1878, died in November, 1926. He was a member of the firm of J. P. Morgan and 
Company, ex-president of the Chemical National Bank and of the New York Clearing 
House Association, director of various banks, and trustee of several educational, 
scientific and philanthropic organizations. 

Mr. Charles J. Stevenot of Brooklyn, New York, who joined the Association in 
1925, died in January. 


MEMBERS ADDED SINCE JUNE, 1927 


Adams, Edward, Jr., Student, Rutgers University, New Brunswick, N. J. 

Adlerblum, Israel S., Metropolitan Life Insurance Company, New York, N. Y. 

Baker, B. E., Southern Bell Telephone and Telegraph Company, Atlanta, Ga. 

Bauman, Adolph O., 418 Broadway, New York, N. Y. 

Beatty, Willard Chrisler, Assistant Professor of Economics and Social Service, 
Wesleyan University, Middletown, Conn. 

Benge, Eugene, Jr., Mitten Management, Inc., Philadelphia, Pa. 

Boardman, Elliott S., Research Department, Pacific Mills, Boston, Mass. 

Bulla, Beatrice, Service Department, National Bank of Commerce, New York, N. Y. 

Cairns, Dr. Andrew, Director, Department of Education, Alberta Wheat Pool, 
Calgary, Canada. 

Carpenter, William M., National Electric Light Association, 29 West 39 Street, 
New York, N. Y. 

Cates, Louis H., 642 East 2 Street, Brooklyn, N. Y. 

Chew, Fred W., Assistant Professor of Insurance and Assistant Director of Bureau of 
Business Research, Indiana University, Bloomington, Ind. 
Childs, Edward B., 35 Mt. Vernon Street, Haverhill, Mass. 
Clague, Ewan, Bureau of Labor Statistics, Washington, D. C. 
Clendenin, J. C., Instructor, University of California, Los Angeles, Calif. 
Compton, Ralph T., Instructor, Department of Economics, Yale University, New 
Haven, Conn. 
Conovich, Alexander T., 37 Wall Street, New York, N. Y. 
Craig, Douglas S., Metropolitan Life Insurance Company, 1 Madison Avenue, New 
York, N. Y. 
De Puy, Charles T., Stromberg Carlson Telephone Manufacturing Company, 
Rochester, N. Y. 
Dill, Frank R., Statistical Department, Cleveland Trust Company, Cleveland, Ohio. 
Dittmer, Dr. Clarence G., Professor of Sociology, New York University, New York, 
N. Y. 
Doan, Robert R., Mid-Manhattan Survey Committee, 17 East 42 Street, New York, 
N. Y. 
Dow, Everett D., Thompson Fenn and Company, 56 Pearl Street, Hartford, Conn. 
Duncan, William A., Student, University of Missouri, Columbia, Mo. 
Dunham, Carroll, 3rd, Wood Low and Company, New York, N. Y. 
Eddy, Herbert G., Telephone Engineer, 463 West Street, New York, N. Y. 
Ely, Charles J., Carr Brothers, Inc., West Palm Beach, Fla. 
Flinn, Helen Louise, Psychopathic Clinic, Recorder’s Court, Detroit, Mich. 
Forsyth, Dr. C. H., Instructor, Dartmouth College, Hanover, N. H. 
Fratkin, James A., Statistician, Businessmen’s Committee on Agriculture, 247 Park 
Avenue, New York, N. Y. 
Fuller, Millard F., Student, Harvard Business School, Boston, Mass. 
Genzmer, Frederic C., Instructor, Wagner College, Staten Island, N. Y. 
Glover, Charles A., Assistant Professor of Economics, College of Business Administra- 
tion, Lehigh University, Bethlehem, Pa. 
Harter, George A., Professor of Mathematics, University of Delaware, Newark, Del. 
Haydon, George F., Workmen’s Compensation Insurance Bureau, 481 Broadway, 
Milwaukee, Wis. 
Hayes, Dr. Joseph W., The Crowell Publishing Company, 250 Park Avenue, New 
York, N. Y. 
Hoffer, Irwin S., Graduate Student, Harvard University, Cambridge, Mass. 
Hoisington, F. R., Jr., Statistician, Western Electric Company, 195 Broadway, New 
York, N. Y. 
Holmes, Bert E., American Telephone and Telegraph Company, 195 Broadway, New 
York, N. Y. 
Hopkins, Dr. John A., Jr., Agricultural Economics Department, Iowa State College, 
Ames, Ia. 
Jappe, Kurt W., Consulting Engineer, 45 East 55 Street, New York, N. Y. 
Kaiser, Albert L., Statistician, W. A. Harriman and Company, Inc., 26 Broadway, 
New York, N. Y. 
Keays, Eldred M., Second Ward Securities Company, Milwaukee, Wis. 
Kilgore, Elizabeth S., Statistician, War Department, Washington, D. C. 
Lahee, Arnold W., Statistician, 72 High Street, Glen Ridge, N. J. 
Leaming, George C., Student, Rutgers University, New Brunswick, N. J. 
Lo, Tai Lai, 526 West 123 Street, New York, N. Y. 
Lombard, Norman, Stable Money Association, 104 Fifth Avenue, New York, N. Y. 
McMullen, Joseph H., Taylor Thorne and Company, New York, N. Y. 

McPherson, H. M., National City Bank, 55 Wall Street, New York, N. Y. 

Manes, Dr. Julius H., Statistician, 338 West 77 Street, New York, N. Y. 

Melander, Alf, Statistician, The Equitable Life Assurance Society of the United 
States, 393 Seventh Avenue, New York, N. Y. 

Meyerson, Alexander W., Standard Department, Bernard Hevitt Company, 812 
Jackson Boulevard, Chicago, Ill. 

Milbank, J. Hungerford, Lecturer in History, 252 Merrick Road, W., Freeport, 
Long Island, N. Y. 

Miller, Bessie I., Professor of Mathematics, Rockford College, Rockford, Ill. 

Mints, Lloyd W., Instructor of Economics, University of Chicago, Chicago, Ill. 

Miyajima, Tsunao, Instructor, Statistics and Economics, Kansai University, Osaka, 
Japan. 

Monroe, F. Adair, Jr., Statistician, Cuban-American Sugar Company, 136 Front 
Street, New York, N. Y. 

Muller, Dr. Oscar R., 100 Broadway, New York, N. Y. 

Newman, J. K., Jr., 60 Broadway, New York, N. Y. 

Oberhumer, Ernst, Statistician, New York Stock Exchange Company, 149 Broadway, 
New York, N. Y. 

Paterson, Ellsworth G. D., Bell Telephone Laboratories, 463 West Street, New York, 
N. Y. 

Phelps, Harold A., Assistant Professor, Brown University, Providence, R. I. 

Putnam, Elizabeth W., Statistician, Bankers Trust Company, 16 Wall Street, New 
York, N. Y. 

Quinlan, Joseph P., Department of Commerce, Washington, D. C. 

Rabe, Olive H., Lawyer, 11 South La Salle Street, Chicago, Ill. 

Raley, P. H., Standard Sanitary Mfg. Company, Pittsburgh, Pa. 

Reid, Gertrude, Family Welfare Association, 31 South Calvert Street, Baltimore, Md. 

Reiner, James, 262 Fulton Street, Brooklyn, N. Y. 

Richardson, Howard B., United States Department of Agriculture, Bureau of 
Agricultural Economics, Washington, D. C. 

Rorem, C. Rufus, University of Chicago, Chicago, Ill. 

Rumpen, Herman, 25 Broad Street, New York, N. Y. 

Sass, Hugo V., Assistant Treasurer, New York and Porto Rico Steamship Company, 
25 Broadway, New York, N. Y. 

Scammon, Richard E., Professor of Anatomy, University of Minnesota, Minneapolis, 
Minn. 

Schanen, Paul, Bell Telephone Company of Pennsylvania, 1835 Arch Street, Phila- 
delphia, Pa. 
REVIEWS 


Der Getreideverkehr der Welt vor und nach dem Kriege, by Kurt Ritter. Berlin: 
Paul Parey. 1926. 343 pp. 


The author of this analysis of the grain trade of the world before and after the 
war is a member of the Agricultural Institute at Berlin and the author of a series 
of books dealing with the economic problems of agriculture. He began the pres- 
ent study with the idea of explaining the changes which have taken place in the 
paths followed by the international grain trade since the great war. For this 
purpose he compiled figures on numbers of cattle, grain production and interna- 
tional trade in grain for every country in the world for the years 1911, 1912, 1913 
and 1920, 1921, 1922 in so far as such figures were available. The book has 
evidently been written with painstaking care, and it contains a great deal of 
interesting material. 

The study of necessity contains a large number of statistical tables, which are 
indexed by countries and are therefore made available for the reader who does 
not wish to go through all the details which have been assembled, but who is 
interested in the trade of some one or two countries. The tables frequently fail 
in clearness because material necessary for understanding of the figures which 
they present has been scattered through preceding paragraphs and does not 
appear in the body of the table. 

Summaries of average imports and exports before and after the war, for each 
grain, and for bread grains and fodder grains, and then of all grains are presented 
in the form of bar diagrams. The classification of bread grains and fodder grains 
seems a doubtful one, as the use to which any of the grains are put by men de- 
pends in an important degree upon where and who the men are. The graphs are 
less helpful than they might otherwise have been, because the breadth of the bars 
as well as their length varies from one section to another and from one period to 
another in the same graph. It is practically impossible for the eye to estimate 
the relationship between the bars. 

American readers who are interested in the problems of wheat economics will 
find the wheat studies of the Food Research Institute at Stanford University 
more useful than Dr. Ritter’s study, for the reason that his statistical data do not 
cover the important developments of the last four years. Students particularly 
interested in the trade in other grains will be obliged to supplement his figures 
with more recent ones from such publications as the yearbook of the International 
Institute of Agriculture. 

For the present reviewer the most interesting part of the book is Dr. Ritter’s 
criticism of the presentation and the accuracy of statistics on the international 
grain trade. His criticism of statistical presentation applies especially to the 
yearbook of the International Institute of Agriculture at Rome; his criticism of 
completeness and accuracy applies to the foreign trade statistics of most of the 
countries important from the point of view of the grain trade. 
Readers familiar with the yearbook of the International Institute of Agricul- 
ture will remember that its summaries of imports and exports of grain are pre- 
sented first by continents and then by hemispheres. Dr. Ritter points out that 
because two countries are situated in the same continent does not mean that they 
both produce a surplus of grain and that a summary by continents tells very little 
about economic relationships. The summary by hemispheres is equally useless, 
if not more so. Dr. Ritter’s own method of summarizing the import and export 
figures on grain is to group them under eleven headings: Deficiency Areas in 
Europe; Surplus Areas in Europe; the Levant; Cyrene, Tripoli, Tunis, Algiers and 
Morocco (grouped under the German term “Atlasländer,” a name infinitely 
more picturesque than most geographical terms); Other African Countries; North 
America; Central America; South America; Oceania; East Asia; and British 
India. In spite of changes in boundary lines since the war he has been able to 
include approximately the same areas in each group before and after the war. 
The “surplus section” of pre-war Europe includes Russia (without Finland), 
Roumania, Bulgaria, Servia and Hungary and, as far as it was possible to pro- 
cure separate statistics, Bosnia and Herzegovina, and the post-war “surplus 
section” approximately the same area. 

His summaries are much more significant than the summaries compiled by 
continents and hemispheres, but it seems to me that they would be more 
valuable if they were not so much condensed and if they gave under each heading 
separate statistics, at least for the more important countries included in the 
group. Exports and imports from any given country vary greatly from year 
to year. In the case of Russia, for instance, estimates recently published in the 
Wheat Studies of the Food Research Institute indicate that Russia will have an 
exportable surplus of 40 million bushels of wheat¹ alone in 1926-27 (as contrasted 
with exports of about 165 million bushels before the war), but in the season 1924-25 
Russian imports of wheat probably amounted to 10 million bushels.² These 
variations are clearly shown in such tables as those published in Broomhall’s 
Corn Trade Year Book, which use no classification except a division into import- 
ing and exporting countries and which give separate statistics for each important 
country. A table showing foreign trade in wheat for the three years 1923-24, 
1924-25, and 1925-26 in the March, 1926, edition of that Year Book includes 
Russia among both the importing and exporting countries showing a forecast of 
exports amounting to 3 million quarters for 1925-26 and actual exports of about 
the same amount in 1923-24 and actual imports of 2 million quarters in 1924-25. 
Either the Broomhall or the Ritter method seems to furnish a more useful type of 
summary than that employed by the Agricultural Institute. 
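
The Wheat Studies figures are in bushels and the Broomhall figures in quarters, so a common unit helps in setting them side by side. The sketch below assumes the customary grain-trade quarter of 480 pounds and the 60-pound bushel of wheat, that is, 8 bushels to the quarter; the review itself does not state the conversion, and the names used are illustrative only.

```python
BUSHELS_PER_QUARTER = 480 / 60  # assumed: 480 lb. quarter, 60 lb. bushel


def quarters_to_bushels(quarters):
    """Convert quarters of wheat to bushels under the assumed weights."""
    return quarters * BUSHELS_PER_QUARTER


# Figures quoted in the review, in millions.
broomhall_exports_1925_26_quarters = 3.0   # forecast exports
broomhall_imports_1924_25_quarters = 2.0   # actual imports
wheat_studies_exports_1926_27_bushels = 40.0
wheat_studies_imports_1924_25_bushels = 10.0

exports_bu = quarters_to_bushels(broomhall_exports_1925_26_quarters)
imports_bu = quarters_to_bushels(broomhall_imports_1924_25_quarters)

print(f"Broomhall exports, 1925-26: {exports_bu:.0f} million bushels")
print(f"Broomhall imports, 1924-25: {imports_bu:.0f} million bushels")
print(f"Wheat Studies exports, 1926-27: "
      f"{wheat_studies_exports_1926_27_bushels:.0f} million bushels")
print(f"Wheat Studies imports, 1924-25: "
      f"{wheat_studies_imports_1924_25_bushels:.0f} million bushels")
```
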

Dr. Ritter’s criticism of the statistics which he had to use in his study reveals a 
more fundamental difficulty than does his discussion of various types of summary. 
In studying the movements of grain between different countries he found differ- 
ences in the figures applying to the same movement of goods in the statistics of 
the two countries concerned, so serious that in many cases it was actually impossible 
to tell what the trade had been. 

1 Wheat Studies, Vol. III, No. 3, p. 169. 
2 Ibid., Vol. II, p. 13. 
3 Broomhall’s Corn Trade Year Book, March, 1926, p. 23. 

The differences in methodology in compiling foreign trade statistics 
which he encountered have been set forth in part in a 
document published by the League of Nations.¹ The confusion which results 
from these diverse methods is very well illustrated by Dr. Ritter’s effort to disen- 
tangle the facts as regards the grain trade. In attempting to arrive at accurate 
figures he has compiled a series of tables each of which gives figures for the trade 
in a given grain from an exporting country to the various countries which receive 
it for six years, and in addition the statistics of all the receiving countries on 
imports of the same grain from the first country, for the same years. In a large 
proportion of the cases he has found enough discrepancies to provide the reader 
with a healthy skepticism as to the validity of any conclusions drawn from foreign 
trade statistics which have not been submitted to a similar critical examination. 
Take the case of Italian imports of wheat from the United States in 1922. The 
foreign trade figures for the United States show an export of 7,905,000 dz.² to 
Italy in that year. Italian statistics show an importation from the United States 
of 17,659,000 dz. of wheat in the same period.³ In that case it seems likely that 
the Italian figure is more nearly accurate than the American; Italy probably re- 
ceived American wheat shipped in the first place to Gibraltar and then sent on to 
Italy; in other words, the importer was probably right. In other cases, as when 
German trade statistics in 1911 showed an import of 10,000 dz. of wheat from the 
Netherlands while Dutch statistics showed exports (not reëxports) of 12,064,000 
dz. of wheat from the Netherlands to Germany, the error was doubtless with the 
exporting, or rather the reëxporting country.⁴ (Dr. Ritter points out that there 
was a reorganization of the Netherlands statistical practice in 1917, and that 
such discrepancies now occur less frequently than formerly.) The fundamental 
difficulty involved in obtaining accurate information in the exporting country in 
regard to the final destination of its wares is strikingly brought out in the figures 
on the amount of grain shipped from Argentina, Australia, Canada, and the 
United States subject to further order. Grain from the United States goes to 
Gibraltar “on order” for Mediterranean countries. The figure for 1925 given in 
statistics of the Foreign Commerce of the United States for our exports of wheat 
to Gibraltar is 520,000 bushels. Dr. Ritter suggests that it might be required 
that the captain of a boat laden with grain shipped “on order” to such places as 
Gibraltar or the Canaries should report the final destination of his cargo, as far as 
he knows it, to the statistical office of the country where he took it on board. 
He further suggests the advisability of holding an international conference for 
the purpose of determining common methods of recording imports for consumption, 
goods in transit, reëxports of “nationalized” goods, the statement 
of country of origin, country of destination, and so on. If foreign trade statistics 
are compiled for the purpose of promoting a better understanding of the eco- 
nomic interrelationships between countries it would seem highly desirable that 
such a conference should be held with power to act, and that the United States 


1 Memorandum on Balance of Payments and Foreign Trade Balances, Part II, Foreign Trade Statistics. 
2 1 Doppelzentner = 100 kilos. 
3 P. 307. 
4 P. 311. 
should send an active participant. Developments of the last few years in this 
country seem to indicate that there is still a good deal to be learned here 
about international trade in general, and the grain trade in particular. 
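
The two comparisons cited above lend themselves to a simple tabulation of exporter-reported against importer-reported figures. The sketch below uses only the four figures quoted in this review; the record layout and the function name `discrepancy_ratio` are illustrative assumptions, and nothing here reproduces Dr. Ritter's own tables.

```python
from dataclasses import dataclass

KG_PER_DZ = 100  # 1 Doppelzentner = 100 kilos, as stated in the footnote


@dataclass
class MirrorPair:
    """One trade flow as recorded by each of the two countries (in dz.)."""
    exporter: str
    importer: str
    year: int
    exporter_reported_dz: int
    importer_reported_dz: int


def discrepancy_ratio(pair):
    """Importer's figure divided by exporter's figure (hypothetical measure)."""
    return pair.importer_reported_dz / pair.exporter_reported_dz


pairs = [
    MirrorPair("United States", "Italy", 1922, 7_905_000, 17_659_000),
    MirrorPair("Netherlands", "Germany", 1911, 12_064_000, 10_000),
]

for p in pairs:
    # Difference between the two records, converted to metric tons.
    gap_dz = p.importer_reported_dz - p.exporter_reported_dz
    gap_tons = gap_dz * KG_PER_DZ / 1000
    print(
        f"{p.exporter} to {p.importer}, {p.year}: "
        f"importer/exporter = {discrepancy_ratio(p):.3f}, "
        f"gap = {gap_tons:,.0f} metric tons"
    )
```

A ratio far from 1 in either direction marks a flow whose two records cannot both be right, which is the kind of case the review singles out.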

Faith M. WILLIAMS 


Cornell University 


Direct Method of Determining Cyclical Fluctuations of Economic Data, by Martin 
Allen Brumbaugh. New York: Prentice-Hall, Inc. 1926. 73 pp. 


Statistical study of time series has largely developed along two rather distinct 
lines. The first of these centers on the analysis of the specific relations between 
several series. It follows the scientific method of approach, setting up tentative 
hypotheses as to cause and effect, and then using the statistical analysis to dis- 
prove, or to verify and measure, the expected relations. Developing from the 
pioneer work of Hooker¹ in England and of Moore² in America, studies of this 
type have largely confined themselves to given specific problems, such as the 
analysis of factors affecting potato prices, for example.³ 

The second line of development has placed emphasis more largely upon the 
characteristics of individual series of data rather than upon the relationships of 
series. So as to show more clearly the response of the individual series to under- 
lying or recurring economic fluctuations, methods have been developed for 
eliminating the seasonal variation as well as the secular changes from the basic 
data. As evolved by Persons⁴ and other students of the business cycle, these 
methods have been applied more generally to descriptions of general economic 
conditions than to quantitative analysis of the causes of variation in specific 
cases. Both methods presumably have the same objective—forecasting future 
changes—but the first method treats of specific changes, and measures their 
causes, whereas the second works largely in terms of long swings and general 
movements. 

There is a considerable degree of overlapping between the first approach and 
the second, and to a certain extent the same methods are employed by both. 
Yet there is the distinctive difference in that one takes the relations between 
series as the thing to be determined, and subordinates such corrections as are 
made to this; whereas the second takes the description of the periodic or recurring 
swings as the primary objective, and thereby makes the refinement of the data a 
necessary and important first step. 

The book by Dr. Brumbaugh is a study in methodology in the second of these 
fields. He attacks the problem of removing trend and seasonal elements from 
time series, and presents a method by which it is possible to obtain a measure of 
cyclical changes at a great saving of time, and without going through the laborious 
processes of first determining trend and seasonal. 


1 R. H. Hooker, “Correlation of Marriage-Rate with Trade,” Journal of the Royal Statistical Society, 
Vol. 64, 1901, pp. 485-492; “On the Correlation of Successive Observations,” Vol. 68, 1905, pp. 696-703. 

2 Henry L. Moore, Economic Cycles: Their Law and Cause, 1914, pp. 62-103. 

3 Holbrook Working, “Factors Determining the Price of Potatoes in St. Paul and Minneapolis,” 
Minnesota Agricultural Experiment Station Technical Bulletin 10, 1922. 

4 W. M. Persons, articles in many issues of the Review of Economic Statistics, and elsewhere, 1919 
and subsequently. 
In order to do this he presents a new measurement of periodic changes, somewhat 
allied to recent developments in the treatment of “cycle” data. These new 
developments have regarded time series as mathematical curves, and computed 
integral¹ or derivative² curves from them, so as to stress either accumulative 
changes or the rate of change. Dr. Brumbaugh, on the other hand, measures 
what he terms “relative cyclical differences.” 

These he intends to be the percentage change over a year’s interval, with trend 
and seasonal removed. (The adequacy of the correction will be discussed later.) 
His measure is thus intermediate between showing the periodic movement in 
the original data and showing the derivative of that movement—the rate at 
which it is changing at any time. It purports to show for monthly data the 
rate at which the basic cycle component (if any) changes between correspond- 
ing months of successive years. 

The method itself is quite simple. In the words of the author: 


1. Divide each item of original data by the item for the same season of the 
preceding year. 

2. Compute the arithmetic average of these relatives. 

3. The variation from 100 of the average above represents trend residue. 
Divide the arithmetic average of the original data by each item of original data 
and multiply the result by the trend residue. This latter is the correction to 
remove the trend residue which is applied to each relative obtained in the first 
step. 

4. The remainders after the subtraction of this correction are the relatives of 
cyclical differences which we set out to obtain. 
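
The four quoted steps can be sketched in Python as follows. This is only one reading of the author's wording, not his published computation: the function name `relative_cyclical_differences`, the `period` parameter, the scaling of the relatives to a base of 100, and the choice to average the original data over the whole series are all assumptions made for illustration.

```python
from statistics import mean


def relative_cyclical_differences(series, period=12):
    """One reading of the four quoted steps (hypothetical helper).

    series: positive observations at equal intervals (e.g. monthly).
    period: number of observations per year (12 for monthly data).
    Returns one value for each item from the second year onward.
    """
    if len(series) <= period:
        raise ValueError("need more than one year of data")

    # Step 1: each item over the item for the same season a year earlier,
    # expressed on a base of 100 (assumed scaling).
    relatives = [
        100.0 * series[t] / series[t - period]
        for t in range(period, len(series))
    ]

    # Step 2: arithmetic average of these relatives.
    average_relative = mean(relatives)

    # Step 3: the departure of that average from 100 is the "trend residue".
    trend_residue = average_relative - 100.0
    # Assumption: "the arithmetic average of the original data" is taken
    # over the whole series.
    average_level = mean(series)
    corrections = [
        (average_level / series[t]) * trend_residue
        for t in range(period, len(series))
    ]

    # Step 4: subtract the correction from each relative.
    return [r - c for r, c in zip(relatives, corrections)]
```

Under this reading a series with no trend, seasonal, or cycle returns 100 throughout, since the trend residue is then zero and no correction is applied.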


It is apparent from this that the “relatives of cyclical differences” given by 
Dr. Brumbaugh’s method are chain relatives for items each a year apart, less a 
correction to remove trend. They resemble in their general characteristics index 
numbers which take the value for each corresponding month a year earlier as 100. 

The author makes the following claims for his method: 

1. No regular or arbitrary type of trend is assumed. 

2. Progressive changes in character or amplitude of seasonal variation do not 
seriously affect the results. 

3. The amplitude of the relatives will vary with the amplitude of the cyclical 
changes. 

These conclusions are largely based upon an examination of the way the 
method works, first with a series of artificial data, and then with three selected 
series; and a comparison of the new “relative cyclical differences” with the relative 
cycles obtained when the same data are analyzed by “orthodox” methods. 

From the analysis of the specially constructed series (a mathematical curve 
with pure sine elements for seasonal and cyclical) the author concludes that his 
method does satisfactorily remove all elements of trend and seasonal. As it 
happens, however, the series he uses has a cyclical element with an amplitude 
five times as great as the amplitude of the seasonal element. Tests by the re- 
viewer with similarly-constructed series where the seasonal variation was rela- 
tively more important show that the accuracy of the method varies with the 


1 Karl Karsten, “The Theory of Quadrature in Economics,” this JOURNAL, Vol. XIX, pp. 14-29. 
2 Irving Fisher, “The Dance of the Dollar,” this JOURNAL, Vol. XVIII, pp. 1024-1028. 1923. 
relative importance of these two elements; the greater the seasonal element, the 
less accurately the method reveals the true cyclical component. 

There seems to be no fundamental basis for distributing the “trend residue” 
in the particular way that the author directs in step 3. Dividing each monthly 
item by the item for the corresponding month of the year before has already 
completely eliminated seasonal, in so far as the seasonal variation is the same per 
cent of trend both years. Varying the correction factor in inverse proportion to 
the absolute size of the original items results in restoring a certain amount of 
seasonal fluctuation to the “relative cyclical differences.” In series such as egg 
prices, where the seasonal movement is the largest single element, the final figures, 
by Brumbaugh’s method, would still contain a residual seasonal element. 
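
The point can be checked on a constructed series. The sketch below is not the reviewer's test series; the 5 per cent annual growth, the sine-shaped seasonal factor, and all names are assumptions chosen for illustration. The series has trend and a fixed percentage seasonal but no cycle, so a method that removed both would return a constant.

```python
import math
from statistics import mean

PERIOD = 12  # monthly data

# Hypothetical series: steady growth of 5 per cent a year times a fixed
# percentage seasonal; there is no cyclical element at all.
series = [
    100.0 * 1.05 ** (t / PERIOD)
    * (1.0 + 0.3 * math.sin(2.0 * math.pi * t / PERIOD))
    for t in range(6 * PERIOD)
]

# Step 1 alone: each item over the same month a year earlier (base 100).
relatives = [
    100.0 * series[t] / series[t - PERIOD]
    for t in range(PERIOD, len(series))
]

# Step 3 correction as quoted: inversely proportional to each item.
trend_residue = mean(relatives) - 100.0
level = mean(series)  # assumption: averaged over the whole series
corrected = [
    r - (level / series[t]) * trend_residue
    for r, t in zip(relatives, range(PERIOD, len(series)))
]

spread_relatives = max(relatives) - min(relatives)
spread_corrected = max(corrected) - min(corrected)
print(f"spread of plain relatives:     {spread_relatives:.6f}")
print(f"spread of corrected relatives: {spread_corrected:.6f}")
```

The plain relatives are constant to within rounding, while the corrected figures swing within each year; that swing is the residual seasonal element described above.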

So far as this method does correct for trend at all, it practically assumes one of 
the simplest and most rigid types of trend—a constant rate of increase or decrease. 
The rate of change correction is, it is true, slightly modified from month to month 
in inverse proportion to the size of the original items, but that does not change the 
general slope materially. This trend is not explicitly shown at all, but its exist- 
ence is necessarily assumed in the process. Since year-to-year differences are 
used, this only affects the result to the extent to which the trend over each period 
is improperly measured; the inaccurate allowances for trend are, therefore, some- 
what minimized in the final result. It is hardly true, however, that “no regular 
or arbitrary type of trend is assumed.” 

One characteristic of the new method which Dr. Brumbaugh points out is that 
it forecasts changes in the direction of swing under certain conditions. With a 
pure sine curve having a period greater than two years, the greatest relative dif- 
ference between two points on the curve a year apart is reached ahead of the 
crests and the troughs. Economic fluctuations are not pure sine curves, however, 
nor are troughs or peaks definitely marked; and it does not appear from Dr. 
Brumbaugh’s results with actual data that this will be much aid to mechanical 
forecasters. 
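
The lead can be illustrated with a small sketch, assuming a pure sine cycle of 48 months about a level of 100 with an amplitude of 10; these values and the variable names are illustrative assumptions, not taken from the book.

```python
import math

CYCLE = 48  # assumed cycle length in months (greater than two years)
YEAR = 12


def level(t):
    """Pure sine cycle about 100; no trend or seasonal (assumed form)."""
    return 100.0 + 10.0 * math.sin(2.0 * math.pi * t / CYCLE)


# Examine one full cycle, months 48 through 95.
months = range(CYCLE, 2 * CYCLE)

# Month at which the series itself reaches its crest.
crest = max(months, key=level)

# Month at which the year-over-year relative is greatest.
peak_relative = max(months, key=lambda t: level(t) / level(t - YEAR))

lead = crest - peak_relative
print(f"crest at month {crest}, largest relative at month {peak_relative}")
print(f"the relative leads the crest by {lead} months")
```

With actual data, irregular movements would blur both turning points, which is the reviewer's reason for doubting the practical value of this lead.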

Dr. Brumbaugh’s discussion of the meaning of trend and seasonal, both as to 
the possibility of change in each, and the impossibility of absolute or final meas- 
urement of either, is very stimulating. As has been indicated, however, the new 
technique offered does not overcome these difficulties. The theoretical analysis 
here has gone far beyond the actual methods evolved. 

The greatest difficulty of the new method is in interpreting the results. The 
relative cyclical differences are a rather involved function of the underlying 
cycles; and no method is offered for transforming them back into more under- 
standable terms. As they stand, they represent a concept which will be difficult 
for statisticians to work with, and will be even more opaque to the average busi- 
ness man than the general run of statistical devices. Computed by methods 
which savor strongly of the rule-of-thumb, and intelligible only in terms of a 
complicated mental transformation, it may be doubted if the new technique will 
prove satisfactory either to the statistician or the layman. 

In fairness to Dr. Brumbaugh, it must be admitted that he leaves it to “the 
final test of experience” to determine the worth of his method, and does not claim 
a place for it solely on the basis of his tests. It may well be that its rapidity and 
ease of computation will make it a valuable aid for the preliminary analysis of 
data characterized by simple trends and relatively mild seasonal fluctuations. 
For the complete determination of cyclical movements, however, it is much to be 
doubted if it will replace the more laborious but more exact methods with separate 
determination of trend and seasonal. 


Mordecai EZEKIEL 


Bureau of Agricultural Economics, 
United States Department of Agriculture 


Employment Statistics for the United States, edited by Ralph G. Hurlin and 
William A. Berridge. New York: Russell Sage Foundation. 1926. xvi, 
215 pp. 

In 1922 the American Statistical Association appointed a committee, now 
known as the Committee on Governmental Labor Statistics, and this volume 
“attempts to present the consensus of opinion of the members of the commit- 
tee . . . concerning problems involved in the collection and publication of 
adequate employment statistics for the United States.” 

The report is timely. According to the index of employment in manufacturing 
industries published by the Bureau of Labor Statistics, the general trend of 
manufacturing employment has been downward since 1920 and, in the absence 
of employment indexes for non-manufacturing pursuits, it cannot be said whether 
the shrinkage in manufacturing has been offset in any degree by expanding em- 
ployment in other fields. The committee recommends that the data now col- 
lected on manufacturing—the total number on the payroll and the total wages 
paid in one payroll period of the month—shall also be secured for mining and 
quarrying, communication, building, construction, wholesale trade, retail trade, 
logging and lumber work and agriculture. 

In a chapter on the Sources of Employment Statistics the report considers as 
statistical measurements of employment: (1) counts or estimates of the number 
unemployed; (2) statistics of demand for labor and applications for work as 
registered in employment bureaus; and (3) periodic counts of number of 
persons employed as shown by payrolls. The conclusion is that the payroll is the 
most feasible source of statistics relating to employment in the United States. 
The development of payroll employment statistics is outlined, and it is shown that 
some of the states are already securing such statistics from industries other than 
manufacturing. But “the great majority of states have no information regard- 
ing employment within their own boundaries, although manufacturing plants in 
them may contribute to the data collected by the Federal Bureau of Labor 
Statistics for the country as a whole.” 

The committee puts forward eleven recommendations as a plan for obtaining 
employment statistics for the country. It is suggested that the initial responsi- 
bility for collecting such statistics should rest upon each state, while the Federal 
Bureau of Labor Statistics should coördinate these state reports and the employment 
statistics gathered by other federal bureaus in their administrative work 
and publish them in one periodical report on employment throughout the nation. 
The eleven recommendations of the plan indicate facts to be secured and methods 
to be followed in collection and publication; several chapters are devoted to the 
detailed procedure of collection from manufacturing and non-factory industries, 
tabulation, construction of employment indexes and publication of results. 
Samples of forms used by a number of statistical agencies are given in an appendix. 
The plan is logical and clear-cut to a degree and if the principle of federal-state 
coöperation is accepted it leaves nothing to be desired. 

One is curious as to the considerations which led the committee to recommend 
federal-state coöperation. The procedure would be greatly simplified if the 
Bureau of Labor Statistics at Washington would collect payroll data directly 
from establishments throughout the country, except in so far as this is already 
being done for some industries through the administrative functioning of certain 
federal departments, and would publish indexes for the nation, the states and the 
larger cities. The editors suggest that placing the initial responsibility for col- 
lection upon each state “is worthy of note as constituting an effective working 
relationship which keeps the advantage of centralization in Washington, while 
retaining the convenience of local collection and use.” But does experience in 
federal-state coöperation in statistics give promise of any such result? Would 
not some of the state governments refuse to have any part in the plan, while 
others would do the work so poorly that their reports could not be accepted? 
Would not the registration-area phenomenon of birth statistics appear also in 
employment statistics? The assertion that “each state needs to include a larger 
number of establishments in its own area to reflect industrial conditions than the 
Federal Government needs from the same area for a national index” is open to 
question. Samples of different sizes may show different employment trends as is 
now the case in some industries as between the reports of the Census of Manu- 
factures and the Bureau of Labor Statistics. Reports of an establishment 
important for a state employment index should be significant for a national index. 

Even if all the states did coöperate there would be forty-eight statistical units, 
most of them poorly equipped, with as many overheads for the taxpayer and 
almost as many standards of performance and degrees of coöperation with the 
central organization. Every month some states would be tardy in forwarding 
their compilations. The original data scattered among the state bureaus would 
be quite inaccessible for further research. There would be forty-eight appropria- 
tions to be secured from the state legislatures rather than one from Congress, and 
with the development of new programs and methods the state officials would 
have to be converted to every change. Except in the cases of a few states already 
in the field a central bureau at Washington could collect, compile and publish 
payroll data much more promptly and accurately and certainly at less expense. 
But perhaps this is not within the field of present practical politics. The com- 
mittee may have decided that the United States Bureau of Labor Statistics 
would not be able to undertake the task for a long time and then suggested what 
seemed the only feasible alternative. One wonders, however, if their plan holds 
any better promise of early fulfillment. 

Bryce M. STEWART 


Industrial Relations Counselors, Inc. 
Sur les Semi-invariants et Moments Employés dans l’Étude des Distributions Statistiques,
par R. Frisch. (From Transactions of the Norwegian Academy of Science) Oslo. 1926. 91 pp. 


In the analytical determination of the numerical values of the constants or 
parameters of a statistical material or a collection of observations three types of 
symmetrical functions play an important rôle, viz.: 

1. The well-known moments $m_r$ defined by 

$$m_r = \sum (x - a)^r F(x), \quad \text{or} \quad m_r = \sum x^r F(x).$$


2. The semi-invariants $\lambda_r$ defined by 

3. The factorial moments $m^{(r)}$ defined by 

where $\binom{x}{r}$ is the usual combinatorial symbol. 


The moment concept was already introduced and applied by Laplace and 
Poisson. It was further developed by Thiele and received what practically 
amounts to a complete systematic treatment in the hands of Karl Pearson and his 
disciples. The semi-invariants were first introduced and systematically developed
by Thiele in his remarkable Almindelig Iagttagelseslære published in 1889.¹ 
The factorial moments are of a comparatively recent origin, having been intro- 
duced in mathematical statistics by Professor Steffensen of Copenhagen in 1923. 
But in spite of the undisputed fact that both the semi-invariants and the facto- 
rial moments possess certain properties, which often make them preferable to the 
usual moments, practically all of the English-speaking statisticians have sadly 
neglected to avail themselves of this fact with the result that many of their dem- 
onstrations become unduly complicated and unwieldy. 

Since moments, semi-invariants and factorial moments are all symmetric 
functions, it is, of course, possible to express one in terms of the other, as actually 
has been done by Thiele in the fourth chapter of his Almindelig Iagttagelseslære, 
and later on by Steffensen in his derivation of the factorial moments. 

One of the achievements of the present memoir by Dr. Frisch is that it treats 
the three systems or types of symmetrical functions from a common point of 
view, taking its root in the mathematical number theory and especially in the 
type of numbers associated with the name of Bernoulli. In this manner Dr. 
Frisch reaches in his first chapter some very general formulas from which he on 
the one hand derives the relations between semi-invariants, moments and factorial
moments, as established by Thiele and Steffensen, and which he on the other hand 
applies in his treatment of incomplete moments and their relations to the
inequalities of Tchebycheff, Hölder, Jensen and Steffensen. 

1 My friend, the late Mr. Vigfusson, pointed out to me in 1922 that Thiele’s definition of semi-invariants
in a limited sense already is found in Lacroix’s treatise on the calculus. Lacroix did, of course, 
not treat the relation between the theory of symmetrical functions and the theory of observations (or 
statistics). Traces of the semi-invariants are, on the other hand, found in the writings of Bessel. But 
to Thiele belongs the honor to have given the first comprehensive and systematic treatment of that 
particular form of symmetric functions, and to have applied his theory to observational or statistical 
data. Lest the casual reader should get confused in the way of nomenclature, it might be well to point 
out that Thiele’s semi-invariants are not the same as the semi-invariants of Sylvester in the algebra of 
quantics. 

In Chapter II Dr. Frisch discusses the parameters of the point binomial from 
both the standpoints of semi-invariants and moments. The most interesting 
part of this chapter appears to be the decidedly elegant manner in which he
determines the exact expression for the incomplete moment of the point binomial in 
the form 

$$M_t = \sum_{x=t}^{s} (x - sp)\,P_x = t\,q\,P_t = t\,q\binom{s}{t}p^{t}q^{s-t}.$$
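
The identity for the incomplete moment is easy to confirm by exact rational arithmetic. The following is a minimal sketch and not Dr. Frisch's derivation: the function names are illustrative only, and $P_x$ is assumed to be the ordinary binomial term $\binom{s}{x}p^x q^{s-x}$ with $q = 1 - p$.

```python
from fractions import Fraction
from math import comb


def point_binomial(s, p, x):
    """One term P_x = C(s, x) p^x q^(s-x) of the point binomial, with q = 1 - p."""
    q = 1 - p
    return comb(s, x) * p**x * q**(s - x)


def incomplete_moment(s, p, t):
    """Direct summation: sum over x = t..s of (x - s*p) * P_x."""
    return sum((x - s * p) * point_binomial(s, p, x) for x in range(t, s + 1))


def closed_form(s, p, t):
    """Closed expression quoted in the review: t * q * P_t."""
    return t * (1 - p) * point_binomial(s, p, t)


# Illustrative parameters (assumed, not taken from the memoir).
s, p = 12, Fraction(1, 3)
for t in range(s + 1):
    # Exact equality holds because Fraction arithmetic has no rounding.
    assert incomplete_moment(s, p, t) == closed_form(s, p, t)
print("incomplete-moment identity holds for every t from 0 to", s)
```

Using `Fraction` rather than floating point makes the comparison an exact check instead of an approximate one.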


The reader, comparing Dr. Frisch’s treatment of the incomplete binomial with 
the corresponding treatment by Pearson and his scholars, cannot fail to be
impressed by the more elegant and certainly less laborious demonstration by the 
Norwegian scholar, compared with which some of the demonstrations in
Biometrika appear almost clumsy. In this connection it might be well to mention 
that Dr. Frisch points out that the formula given in Biometrika (Vol. XV, 1924, 
p. 202) as 

$$\sum_{x=t}^{s} P_x = t\binom{s}{t}\int_0^{p} z^{t-1}(1-z)^{s-t}\,dz$$

already was given by Laplace in a slightly different form. Incidentally we 
might also point out that the formula in the same notation as that of Pearson’s 
is found in Meyer’s treatise on probabilities, published in Bruxelles in 1874. 
Moreover, the eminent American telephone engineer, Mr. E. C. Molina, used 
Laplace’s formula as a starting point for important extensions in relation to the 
Poisson exponential in an interesting article which he wrote for the June, 1913, 
number of the American Mathematical Monthly. It appears, therefore, that 
much of the present feverish activity among the students of the Pearsonian 
school along these lines contains comparatively little in the way of novelty or 
originality. 
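
The relation between the binomial tail and the integral can likewise be checked exactly. The sketch below assumes the formula is the familiar incomplete-beta identity, $\sum_{x=t}^{s} P_x = t\binom{s}{t}\int_0^p z^{t-1}(1-z)^{s-t}\,dz$, and integrates term by term after expanding $(1-z)^{s-t}$. The function names and the test parameters are illustrative assumptions.

```python
from fractions import Fraction
from math import comb


def upper_tail(s, p, t):
    """Direct sum of the binomial terms P_x for x = t..s."""
    q = 1 - p
    return sum(comb(s, x) * p**x * q**(s - x) for x in range(t, s + 1))


def integral_form(s, p, t):
    """t * C(s, t) * integral from 0 to p of z^(t-1) (1-z)^(s-t) dz, for t >= 1.

    (1-z)^(s-t) is expanded by the binomial theorem and each power of z
    is integrated exactly, so no numerical quadrature is needed.
    """
    integral = sum(
        comb(s - t, k) * (-1) ** k * p ** (t + k) / (t + k)
        for k in range(s - t + 1)
    )
    return t * comb(s, t) * integral


s, p = 12, Fraction(1, 3)
for t in range(1, s + 1):
    assert upper_tail(s, p, t) == integral_form(s, p, t)
print("tail sum equals the integral form for t = 1 ..", s)
```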

Dr. Frisch has also in his memoir given us an important contribution in a series 
of approximate formulas which enable us to compute with very little labor or 
trouble approximate values for the incomplete binomial. As an example he gives 
the calculation of the upper tail end of the point binomial 

$$\left(\tfrac{1}{3} + \tfrac{2}{3}\right)^{12} \quad \text{from } t = 7.$$

Apart from the denominator, 531,441, the exact value of this expression is 
35,313, while the three approximations from Dr. Frisch’s formulas are 39,424, 
38,614 and 35,953 respectively, indicating an error of less than 2 per cent in case 
of the third formula. In view of the fact that the older methods, as well as the 
Pearsonian methods, are rather lengthy and time-consuming, a close study of Dr. 
Frisch’s method might very well prove profitable to those statisticians who have 
occasion to work with incomplete binomial functions. (For higher exponents 
the approximations are even closer than the above example.) 
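
The figures quoted can be reproduced directly. The sketch below reads the example as the binomial with exponent 12 and probability 1/3, the reading consistent with the denominator 531,441 = 3^12. It computes the exact numerator of the tail from t = 7 and the relative error of each of the three approximations quoted above. It does not implement Dr. Frisch's approximate formulas themselves, which are not given in the review.

```python
from fractions import Fraction
from math import comb

n, t = 12, 7
p = Fraction(1, 3)          # assumed success probability
q = 1 - p

# Exact upper tail, then its numerator over the common denominator 3**12.
tail = sum(comb(n, x) * p**x * q**(n - x) for x in range(t, n + 1))
numerator = tail * 3**n

# The three approximate numerators quoted in the review.
approximations = [39424, 38614, 35953]
errors = [Fraction(a - numerator, numerator) for a in approximations]

print("denominator:", 3**n)
print("exact numerator:", numerator)
for a, e in zip(approximations, errors):
    print(f"approximation {a}: relative error {float(e):.2%}")
```

The third approximation exceeds the exact numerator by 640, a relative error of about 1.8 per cent, in agreement with the reviewer's statement.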

The third chapter deals with the parameters of the hypergeometric series and 
their relation to the binomial series, and introduces the Laplacean concept of 
generating functions. As shown by Steffensen, the introduction of factorial mo- 
ments often lessens the analysis of the hypergeometric series. This chapter is 
perhaps the most difficult of Dr. Frisch’s brochure, and the present reviewer 
hesitates to offer any critical comments thereon. 

In conclusion it should be pointed out that the statistician who is not thor- 
oughly grounded in higher mathematical analysis and the theory of numbers will 
find little cheer in the Frisch memoir, and our Norwegian scholar’s mathematical 
treatment is perhaps too formidable and goes over the head of the average student 
of statistics. These difficulties should, on the other hand, not alarm, but rather 
invite the mature student to tackle this brochure. A careful reading of Frisch 
will not alone prove stimulating, but yield practical benefits as well to those who 
have the time and mental energy to do a little digging on their own account. 
For this reason the present reviewer considered it not alone a privilege, but also 
a pleasure to prepare the above comments. 

ARNE FISHER 


The National Income, 1924. A Comparative study of the income of the United 
Kingdom in 1911 and 1924, by Arthur L. Bowley and Sir Josiah Stamp. New 
York: Oxford University Press, American Branch. 1927. 59 pp. 


In this book the authors bring down to date, as of 1924, a previous estimate 
of national income. It is, as well might be expected, an excellent example of 
skillful statistical work. The task calls for ingenuity in finding and filling de- 
ficiencies and gaps in quantitative data by estimates that are well-grounded and 
well-reasoned guesses, rather than applications of formally mathematical ideas. 
Skillful work in such matters supposes not only familiarity with available data, 
but also a high degree of conversance with the underlying concrete facts, the 
latter requirement being especially exacting, even when thus qualified for degree. 
An acceptable result also supposes rare powers of statistical analysis. 

At this distance it would, of course, not be easy, if there were occasion, to 
check up the results. Some conclusions are of special interest. 

The authors find that, whereas aggregate income in Great Britain and North 
Ireland increased 105 per cent in the thirteen years, social income, reached by 
subtracting transfers (pensions, relief, education, etc.), increased only 90 per cent. 
Wages have held their own, in terms of proportion of the total, despite a reduc- 
tion of the working week by about 10 per cent and an increase of unemployment 
by one-twentieth of the normally occupied population. The proportion of earned 
income other than wages has slightly increased. Taxes have increased, in terms 
of proportion, from 11 per cent to 20.2 per cent of taxable income, most of the 
difference being accounted for by national debt and pensions. “We think that 
the real home-produced income per head (when duplication is eliminated) did not 
differ appreciably from that in 1911,” but consideration of indexes of physical 
production and of other evidence suggests that it “may have been 4 per cent less.” 
Savings were 16 per cent of total social income in 1911 and, largely owing to bad 
trade and unemployment, 12 or 13 per cent in 1924. The percentage of net in- 
come going to the richer section is considerably less than it was in 1911. 

Of course these statements, apart from their context, are scarcely more than 
suggestive. The conclusions of the authors are not rosily optimistic, but they do 
not indicate that the British people need feel discouraged. 





G. P. WATKINS 
Washington, D. C. 


Cyclical Fluctuations: Retail and Wholesale Trade in the United States, 1919-1925, 
by Simon S. Kuznets. New York: Adelphi Company. 1926. xx, 201 pp. 


This book bears ample evidence of the diligence of a rising young statistician 
who also essays unusually sustained flights of economic interpretation. 

The materials of the study are the familiar data on monthly sales, compiled 
monthly since January, 1919, by the Federal Reserve Board in coöperation with 
the Reserve Banks. Few if any of these many trade series have escaped the 
attention of Dr. Kuznets in preparing his thesis. The presence of forty-one 
charts and sixty-four tables attests the comprehensiveness of the author’s interest. 
Most of these materials present indexes of the sales data adjusted for secular 
trend, for seasonal variation, and in many cases for price changes as well; a few 
concern selected economic correlatives of trade, such as industrial production, 
member bank loans and discounts, savings bank deposits, and the money income 
and commodity purchasing power of factory employees. 

In his statistical technique, Dr. Kuznets has taken many chances. The method 
apparently was to use whatever period was available for calculating secular 
trend and seasonal variation (usually January, 1919, through July, 1924) without 
any adequate defense of the suitability of the period for either purpose. Through 
“blanket procedures” of this sort it is possible to secure misleading results, es- 
pecially when the period is so short, and subject to such marked changes in trade 
conditions, as were these five and a half years. If Dr. Kuznets made rigorous 
technical tests of statistical validity, it would have been helpful to show at least 
some of them as supporting evidence. Much of his subsequent argument hinges 
on the comparative amplitudes of fluctuations in the several series—an argument 
which necessarily postulates the validity of secular trends and seasonals. 

Perhaps the most significant portion of the book is Chapter II, dealing with 
price deflators. Dr. Kuznets has gone further than anyone else, with the possible 
exception of Carl Snyder, in treating this difficult problem. He has not been satisfied
with ready-made deflators, but has patiently “rolled his own” by methods 
described in the Appendix to that chapter (pp. 117-128). New price indexes are 
presented for retail groceries, wholesale groceries, department-store goods, whole- 
sale dry goods, wholesale drugs and pharmaceuticals, and wholesale hardware, 
with a general weighted index intended to represent all wholesale lines covered by 
the Federal Reserve Board. All students of value series will be grateful to Dr. 
Kuznets for his patient pioneering in deflation of price factors. The deflated 
series which he secured by applying his price indexes appear to be the most reliable 
yet available. 
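
The arithmetic of deflation can be sketched briefly. The snippet below is an illustration under stated assumptions only: the function name `deflate` and every figure in it are hypothetical and are not drawn from Dr. Kuznets's tables. It divides a dollar-sales index by a price index for the same trade line and base period to obtain an index of physical volume.

```python
def deflate(value_index, price_index, base=100.0):
    """Convert a dollar-value index to a physical-volume index.

    Both series are assumed to be on the same base (base = 100) and to
    cover the same months; each value is divided by the matching price.
    """
    if len(value_index) != len(price_index):
        raise ValueError("series must cover the same periods")
    return [base * v / p for v, p in zip(value_index, price_index)]


# Hypothetical monthly figures, for illustration only.
dollar_sales = [100.0, 120.0, 90.0]
prices = [100.0, 125.0, 80.0]

volume = deflate(dollar_sales, prices)
print(volume)
```

In the second month of this invented series dollar sales rise by a fifth while prices rise by a quarter, so the deflated index falls below its base. This is the kind of divergence between value and volume that a deflator is meant to expose.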

Chapter III, on Cyclical Fluctuations in the Distribution of Incomes, contains 
much that is interesting to the layman, although little that is essentially new 
except a synthesis of earlier writers’ findings. The point chiefly emphasized is 
that incomes in the form of wages, salaries, dividends and interest payments 
show greater cyclical stability than does the production of manufactures. 

Chapter IV contains a painstaking explanation of the reasons for differences 
in amplitude (σ) of the adjusted indexes for (1) retail trade, (2) wholesale trade, 
and (3) industrial output. The author sets out with the statement that “mere 
speculation would, at the outset, lead us to assert the necessity of strict and rigid 
correspondence between retail and wholesale sales, wholesale sales and manufac- 
turing output without any allowance for a different variability.” This reviewer 
fails to follow the logic supporting the statement, and doubts that any economist, 
business man or statistician, well informed on manufacturing, merchandising, and 
stock-maintenance policies, would hold such an a priori view. It presupposes too 
high a vacuum of factual information; it is “non-Euclidean.” However, the 
author proceeds to demolish the result of the alleged speculation, by a most care- 
ful and thoughtful, if somewhat labored, process of economic analysis. From 
the facts pointed out in the three initial chapters, he shows rather conclusively 
that the physical volume of retail trade not only does, but perforce must in the 
present organization, fluctuate cyclically within narrower limits than does that of 
either wholesale trade or industrial output. 

W. A. BERRIDGE 


Metropolitan Life Insurance Company 


Migration and Business Cycles, by Harry Jerome. New York: National Bureau 
of Economic Research Inc. 1927. 256 pp. 


Many good books have been published on the immigration problem, but the 
subject matter of Migration and Business Cycles is handled in an entirely new 
way. It is an interesting statistical study of the relation of immigration and 
emigration to changes in the business and employment conditions in the United 
States and in the countries of emigration. In the characteristic way of the care- 
ful statistical analyst, the author defines the objects of his inquiry in clear-cut and 
precise terms, as follows: 

1. To what extent do cyclical and seasonal fluctuations in migration correspond, in 
time and degree, with fluctuations in industrial activity, particularly as 
measured by employment or unemployment? 

2. What noteworthy variations in cyclical and seasonal fluctuations appear when 
migrants are classified by sex, prior occupation, race or country of origin? 

3. What is the relative influence of the “push” or the “pull” upon fluctuations in 
migration, that is, are such fluctuations primarily determined by changes in 
the country of emigration or in the country of immigration? 

4. What is the economic significance of the ascertained tendencies? 


The reader who is not familiar with technical statistical methods, or is inclined 
to become rather seasick by riding upon the violent waves of statistical curves, 
is encouraged by apt chapter summaries which explain in plain language the 
meanings of the statistical data. 

In the second chapter of his book, Dr. Jerome reviews the movement of immi- 
gration to the United States from 1820 to 1924, and traces the opportunities for 
the employment of immigrants in this country, and points out the major indus- 
tries in which immigrants are employed, thus laying the foundation for his 
subsequent analysis of the relations between employment conditions and immigration
movements. In the following chapters of his work, the author compares 
available immigration statistics with (1) annual statistics of imports of merchandise,
(2) annual statistics of pig-iron production and quarterly statistics of imports
of merchandise, (3) estimates of factory employment and of pig-iron production,
and (4) short-period indices of employment conditions. 

Throughout his study Dr. Jerome finds substantial evidence of close correlation 
between business conditions and immigration. According to the author, “The 
cyclical fluctuations in emigration are the inverse of the cyclical fluctuations in 
immigration. When industry booms, immigration increases and emigration 
decreases; when industry is dull, immigration declines and emigration increases.” 
(p. 107) 

A comparison between employment conditions in the principal countries of 
emigration and in the United States shows that times of prosperity in this coun- 
try are also times of prosperity in the countries of emigration, and that periods 
of depression in this country are synchronous with periods of depression in the 
countries of emigration. A noteworthy exception is found in the case of Italy, 
where there is a lack of distinct relationship in the business cycles as compared 
with this country; so that periods of prosperity in this country occurred during 
times of relative depressions in Italy. From these comparisons the author ar- 
rives at the conclusion that immigrants leave when employment conditions are 
best in the countries of emigration as well as in the United States. He calls 
attention, however, to the fact that even in times of industrial depression in this 
country there have been substantial increments in the working population due to 
the excess of immigration over emigration. 

Dr. Jerome’s book is one that should be read by every one interested in the 
controversial question of the causes of immigration and the relation of immigra- 
tion to employment and unemployment. Migration and Business Cycles is de- 
voted to the question of when the immigrant comes to the United States rather 
than to the question why he comes. The author is fully aware of the fact that 
economic considerations are not the sole determinants of migration, but he
assumes the economic factor to be the determining cause of immigration. “It 
will be granted,” he says, “that the hope of economic betterment is not the sole 
motive for emigration. Religious or political persecution, racial discrimination, 
or the mere love of adventure may be the impelling force. But in the main, the 
emigrant is a seller of labor, seeking the best prices for his services, and hence not 
apt to be attracted by a stagnant market.” After reading the book, I am not 
convinced that economic considerations are the dominant reasons for immigration,
in spite of the fact that the author has established beyond doubt a definite 
and close relationship between the inflow of immigrants and employment con- 
ditions. The men and women who came to the United States in such large 
numbers prior to the World War and for a short time after the World War might 
have been actuated by reasons of religious and political persecutions, but found 
it best to migrate when opportunities for finding work in the United States were 
best. Since most immigrants, and for that matter most people, earn their living 
by working for some one else, they must time their migration with employment 
conditions in the country to which they migrate. Therefore, the very fact that 
immigrants come here when employment is most likely to be secured does not 
prove that the likelihood of more profitable employment is the real cause of mi- 
gration. It may be the immediate cause only. The answer to the why of mi- 
gration is to be found in a study of living and working conditions, here and abroad, 
of the respective races constituting the bulk of our immigration. The Italian, 
who has been called the “bird of passage,” might come to this country principally 
to make a stake and to return to his sunny Italy. The Hebrews have come principally
to avoid religious and political persecutions; others might have come only 
because of a desire to join their relatives. And to what extent are politics and 
political reasons responsible for the migration of the Irish? All races may find 
it most convenient to come to the United States in times of prosperity, but the 
fact of prosperity alone cannot be accepted as the determining factor of tides of 
all immigration. 

It would be enlightening to study the reasons for interstate migration in the 
United States. For instance, if data were available, one would possibly find a 
close relation between business conditions in California and the permanent migra- 
tion into California. But, will that mean that the people migrate to California 
because of the hope of better jobs? I do not think so. Most people come to 
California because of climatic and health reasons, even though they might come 
in largest numbers when employment conditions are best here. 

Louis BLOCH 


Bureau of Labor Statistics, San Francisco 


The Income Tax in Great Britain and the United States, by Harrison B. Spaulding, 
Ph.D. London: P.S. King & Son, Ltd. 1927. 320 pp. 


This volume is devoted to a comparative study of income tax legislation and 
administration in Great Britain and the United States. After a brief discussion 
of the history of the income tax in each country, the existing laws are analyzed 
in detail, the merits and demerits of each being carefully compared. The follow- 
ing selected topics indicate something of the scope of the study: (1) Rates of tax, 
(2) persons liable to tax, (3) problem of double taxation, (4) capital gains and 
casual profits, (5) exempt income, (6) wasting assets, (7) collection of tax at the 
source, and (8) administration. 

It is rare to find a book dealing with a subject of this kind which is so readable 
as the present volume. The author displays remarkable ability in extracting the 
essentials from the laws of the two countries and in making comparisons which 
bring the important contrasts into relief. The relative merits and demerits of 
the systems prevailing in the two countries are analyzed in a searching and yet 
unbiased manner. Dr. Spaulding’s conclusions appear to the reviewer so well 
founded that it is difficult to take issue with any of them. In only one respect 
does it seem probable that the author has been unconsciously influenced in favor 
of Great Britain’s law by the fact that he is accustomed to British conditions. 
In the opinion of the reviewer, Dr. Spaulding has assigned more importance than 
is justified to the fact that, in Great Britain, the process of appeal by the income- 
tax payer is much simpler than in the United States. Most Americans will 
doubtless be surprised to learn that, because of the British method of collecting 
income tax at the source, the number of claims for re-payment has run as high as 
1,991,000 in a single year, although but 2,515,000 people were chargeable with the 
tax. Obviously, under such circumstances, it is imperative that the taxpayer 
have easy access to the income tax officials. In the United States, on the other 
hand, where but a small minority of all tax payers have any adjustments made 
in their self-calculated income tax, and where but a trivial fraction find it neces- 
sary to enter appeals from the decision of the collector, there is much less occasion 
for laying stress upon the lack of convenient facilities for appeal. 

The evidence of Dr. Spaulding’s study seems to the reviewer to indicate that, 
for the great majority of taxpayers, the American system is more convenient 
than the British, and that ours is also much simpler from the administrative 
standpoint. As to whether the British procedure offers greater resistance to 
evasion, one is still in doubt. 

Persons interested in income tax laws and administration will profit much by 
reading this excellent book. 

WILLFORD I. KING 


Mathematics for Engineers, by Raymond W. Dull. New York: McGraw-Hill 
Book Company. 1926. xvii, 780 pp. 


Mr. Raymond Dull presents in his book Mathematics for Engineers a practical 
assemblage of the mathematical field, comprising 57 chapters. The introduction 
explains that much of the material presented is based on notes which the author 
was in the habit of making in his capacity as consulting engineer. A quotation 
from the preface follows: “This treatise on mathematics has been prepared pri- 
marily for engineers. In this we would include (1) engineers who want a quick 
and convenient reference, (2) engineers who have grown somewhat rusty in their 
mathematics, and (3) engineers who feel the need of a text for the study of mathe- 
matics.” 

This quotation explains the author’s point of view throughout the book, 
namely, to give as far as possible the intermediate steps required for each mathe- 
matical problem. With this in mind Mr. Dull has selected the material in such 
a way that while a particular example might at first glance seem specific, there is 
enough mathematical margin in reserve to apply the same method to other re- 
lated problems. 

In thus handling this subject the book is more than merely a collection of 
examples. Mathematics for Engineers is of interest to a group of scientific 
workers outside the specific class to which the title of the book refers. 

The author is master of the graphic algebraic solution and shows by numerous 
examples that he considers the graphic algebraic method of fundamental impor- 
tance in the approach to many problems. On page 86, for instance, there is a 
discussion of “Graphical Solution of Additional Problems.” Mr. Dull there presents
the well-known problem of the flow of water at different delivery rates from 
different faucets into a common tank, and the problem resulting is to show graph- 
ically the amount of water in the tank at any instant. The solution is simple, 
but at the same time can be conveniently applied to items such as shipments, 
production, collections and other activities which work against time at different 
rates and for which the resultant at any instant is desired. Some years ago the 
writer made use of the method presented here in the analysis of an overtime 
problem in relation to desired production.¹ Another interesting and important 
application of this elementary relationship is brought out in Mr. Karsten’s
contribution relating to the Harvard Business Indexes,² where the “quadrature or 
cumulative relation,” as Mr. Karsten calls it, is shown by the inflow and outflow 
of water into a reservoir together with the record of the heights of the water-level 
in the reservoir at the same time. There are numerous other problems which 
come within the scope of linear or first-degree equations, and in most cases the 
author shows the algebraic and graphic solution jointly. 
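
The tank problem reduces to adding up straight-line segments, which is why it lends itself to a graphical solution. As a rough sketch, assuming each faucet runs at a constant rate between a start and a stop time, the amount in the tank at any instant is the sum of rate times running time over all faucets. The function name, the tuple layout and the figures are hypothetical; an outflow is represented here by a negative rate.

```python
def amount_at(t, flows):
    """Amount accumulated at time t from several constant-rate flows.

    flows is a list of (rate, start, stop) tuples. A flow contributes
    rate * (time it has been running by t), and nothing before `start`
    or after `stop`. A negative rate stands for an outflow.
    """
    total = 0.0
    for rate, start, stop in flows:
        running = min(max(t - start, 0.0), stop - start)
        total += rate * running
    return total


# Hypothetical faucets: (gallons per minute, minute opened, minute closed).
faucets = [
    (3.0, 0.0, 10.0),   # open throughout
    (5.0, 2.0, 6.0),    # opened later, closed early
    (-2.0, 4.0, 10.0),  # a drain, treated as a negative rate
]

for minute in (0, 5, 10):
    print(minute, amount_at(minute, faucets))
```

The same function serves for shipments, production or collections: only the meaning of the rate and the unit of time change.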

Chapter VI, page 99, treats quadratics and second-degree equations in like 
manner. The author is particularly concerned with showing the graphic method 
for finding the roots of an equation, a very helpful way in clearing up any points 
which cannot be visualized in their full significance, if considered without 
diagram. 

Chapter XV deals with logarithms and their application. Here, too, the expo- 
sition is clear and supported by various examples; the material presented will 
serve as a welcome reference in all cases of doubt. The natural sequel is 
“Exponential Functions and Their Relation to Logarithmic Functions,” as 
described in the chapter following. Examples of the “Power-Function Graph,” 
as Mr. Dull calls it, are useful in illustrating situations where the dependent 
as well as the independent variable is plotted in logarithmic coördinates. 

Whenever statistical data are plotted in this manner and the points of observation
arrange themselves closely in a straight-line trend (using logarithmic
coördinates), the equations of the phenomena can be determined with minimum 
effort, bearing in mind that the tangent function of the slope angle is the value 
of the exponent of the independent variable. 
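
The rule about the slope can be illustrated numerically. If y = c·x^k, then log y = log c + k·log x, so a straight line fitted to the logarithms has slope k. The sketch below assumes an ordinary least-squares fit, which is a modern substitute for reading the slope off the chart, and uses invented data. The slope equals the tangent of the angle on paper only when both logarithmic axes are drawn to the same scale.

```python
import math


def fit_power_law(xs, ys):
    """Fit y = c * x**k by least squares on (log10 x, log10 y).

    Returns (c, k). All x and y must be positive.
    """
    lx = [math.log10(x) for x in xs]
    ly = [math.log10(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    k = sxy / sxx                 # slope on the log-log chart = exponent
    c = 10 ** (mean_y - k * mean_x)  # intercept gives the coefficient
    return c, k


# Invented observations lying exactly on y = 3 * x**1.5.
xs = [1, 2, 4, 8, 16]
ys = [3 * x**1.5 for x in xs]

c, k = fit_power_law(xs, ys)
print(f"coefficient about {c:.3f}, exponent about {k:.3f}")
```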

1 R. von Huhn, “Graphic Analysis of an Overtime Problem,” Industrial Management, February, 1919, p. 86. 
2 Karl G. Karsten, “The Harvard Business Indexes—A New Interpretation,” this JOURNAL, December, 1926, p. 403. 
Mr. Dull then devotes a chapter to the slide rule, the indispensable instru- 
ment for engineers and other scientific workers. 

Chapter XXIV gives the graphic presentation of trigonometric functions, a 
feature which many books, dealing with the same subject, do not carry. The 
graphic presentation of the sine, cosine and other trigonometric functions is not 
only extremely helpful, but once understood it makes errors almost impossible, 
especially when the correct signs are wanted for angles greater than 90°. 

The differential and integral calculus are treated in the last part of the book, 
the material covering some 216 pages. For those who see in the science of 
mathematics a necessary evil and advocate “common sense” as a substitute in 
place of mathematical reasoning, the remarks of Dr. Bliss at the recent annual 
meeting of the American Association for the Advancement of Science at
Philadelphia should be of interest, namely, “Common sense should be scrupulously 
avoided in places where common sense does not apply. No amount of common 
sense, unaided, can predict the motions of heavenly bodies or compute a range 
table.”¹ 

In discussing Mr. Wood’s paper,² the same opinion was expressed in another 
way at a recent meeting of the Royal Statistical Society, London, February 15, 
1927. Finding it difficult to follow the method of his reasoning, Mr. John 
Hilton stated: “As a Society, one of our objects is to make all our work clear to 
the general public, who do not know—or wish to know anything about statistical 
methods; and now, after my plea for clarity, I am going to contradict myself 
straight away. I am not sure that Mr. Wood would not have simplified it by 
making it more complicated. If he had inserted one algebraic formula showing 
how he made up his calculations he might have cleared up the whole matter.” 

In presenting the concept of the differential calculus the author seems well 
aware that the rate of change, the instantaneous rate of change, and the theory 
of limits are cornerstones of the science of differentiation and integration, 
and that to grasp these concepts fully is to succeed in the constructive use of 
both. Mathematics for Engineers represents a new addition to scientific 
literature in which both author and publisher have given their best effort. 

R. von Huhn 


Statistical Methods for Research Workers, by R. A. Fisher. Edinburgh and Lon- 
don: Oliver and Boyd. 1925. ix, 239 pp. 


Most books on statistics consist of pedagogic rehashes of identical material. 
This comfortably orthodox subject matter is absent from the volume under re- 
view, which summarizes for the non-mathematical reader the author’s independ- 
ent codification of statistical theory and some of his brilliant contributions to 
the subject, not all of which have previously been published. 

The theory of probable errors in ordinary use is valid only in the limit as the 

1 Mechanical Engineering, February, 1927, pp. 180-185. 


2 G. H. Wood, “An Examination of Some Statistics Relating to the Wool Textile Industry,” Journal 
Royal Statistical Society, Vol. XC, Part II, 1927, pp. 272-325. 
size of the sample approaches infinity. The author has created most of the exist- 
ing theory of small samples. It is therefore natural that the book deals largely 
with methods which must be used where the number of cases is limited and which, 
he holds, may well be used in general. The common occurrence of short series 
in economic and other statistics will make the work valuable to a larger class of 
research workers than the biologists for whom it was primarily intended. 

Of particular interest are the methods of evaluating the significance of correla- 
tion coefficients drawn from small samples, the tests of significance of differences 
of means, and the method (p. 125) of fitting a polynomial to a series of observa- 
tions by adding terms one at a time until the fit is sufficiently good. All these 
are due entirely, I think, to Mr. Fisher’s own researches. Some of the tables, 
particularly V (A) which gives the values which a correlation coefficient must 
attain in order to reach certain levels of significance, are indispensable for the 
worker with moderate-sized samples. 
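
As an illustration of what such a test involves, the sketch below uses the standard transformation of a correlation coefficient r from a sample of n pairs into a t statistic with n - 2 degrees of freedom, t = r·sqrt(n - 2)/sqrt(1 - r²). It is offered as a textbook form of the test, not as a reproduction of the book's tables; the function names, the sample values r = 0.7 and n = 10, and the critical value 2.306 (the two-sided 5 per cent point of t for 8 degrees of freedom, taken from standard tables) are assumptions supplied for the example.

```python
import math

def t_from_r(r, n):
    """t statistic (n - 2 degrees of freedom) for a correlation r from n pairs."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

def critical_r(t_crit, n):
    """Smallest |r| that reaches a given critical t, for a sample of n pairs."""
    df = n - 2
    return t_crit / math.sqrt(t_crit * t_crit + df)

# Assumed values for illustration only
T_CRIT_5PCT_DF8 = 2.306   # two-sided 5 per cent point of t, 8 d.f. (standard tables)
r, n = 0.7, 10

t = t_from_r(r, n)
r_needed = critical_r(T_CRIT_5PCT_DF8, n)
significant = abs(t) > T_CRIT_5PCT_DF8
```

Under these assumptions a coefficient of 0.7 from ten pairs passes the 5 per cent level, while the second function shows how large r must be in a sample of that size, which is the kind of figure a table of critical correlations saves the worker from computing.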

The author (who must not be confused with Arne or Irving Fisher) has an 
answer to the query as to the singular of “statistics.” A statistic is a quantity 
such as an average or correlation coefficient which summarizes characteristics of 
a body of data relevant to a particular inquiry. The criterion of “maximum 
likelihood” for the derivation of statistics which he invented is illustrated. His 
position regarding the number of classes with which tables of χ² must be entered 
is clearly set forth. 

Some slight preliminary knowledge of statistical concepts is needed to make the 
reading smooth. The absence of proofs and the omission of some topics make 
the book inadequate for mathematical readers, who will find it desirable to read 
also the author’s original papers. If used in a course in mathematical statistics 
it should be supplemented with these proofs. It would be impossible to combine 
in such brevity a compendium of sound practical methods with a complete theo- 
retical development. The author’s work is of revolutionary importance and 
should be far better known in this country. 

Harold Hotelling 


Stanford University 